Fundamentals of Generative AI Flashcards

Explore how generative AI and large language models function, including prompt engineering and common challenges like hallucinations. (11 cards)

1
Q

In the context of generative AI models, what are tokens?

  1. Tokens are the smallest units of text, such as words or subwords, that a generative model processes during input and output.
  2. Tokens are the numerical encodings of concepts or terms that a generative model uses in its computations.
  3. Tokens are the model’s learned parameters that are adjusted during fine-tuning for different tasks.
  4. Tokens are the commands or instructions that guide the model to produce a specific response.
A

1. Tokens are the smallest units of text, such as words or subwords, that a generative model processes during input and output.

In generative AI, tokens are the core building blocks that represent segments of text. These units could be full words, subwords, or even characters, depending on the model. The model processes input as sequences of tokens to generate predictions or responses.

  • Tokens are the numerical encodings of concepts or terms that a generative model uses in its computations is incorrect because tokens are not the numerical encodings themselves. They are converted into numerical representations (embeddings), which the model uses during its computations.
  • Tokens are the model’s learned parameters that are adjusted during fine-tuning for different tasks is incorrect because tokens are not parameters of the model. Parameters refer to the internal weights of the model, whereas tokens are the pieces of text that the model uses as input.
  • Tokens are the commands or instructions that guide the model to produce a specific response is incorrect because tokens are not commands or instructions but the basic elements (words or subwords) processed by the model. Prompts, not tokens, represent the specific instructions given to the model.

Reference:
Transform your business with generative AI

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

A company is developing an AI application that uses language models for inference on edge devices. The company needs the lowest possible latency for real-time predictions.

Which approach should the company take?

  1. Deploy optimized small language models (SLMs) on edge devices.
  2. Deploy large-scale language models (LLMs) on edge devices with sufficient processing power.
  3. Use cloud-hosted small language models (SLMs) through an API for communication with edge devices.
  4. Use cloud-hosted large language models (LLMs) through an API for real-time predictions.
A

1. Deploy optimized small language models (SLMs) on edge devices.

This is correct because deploying small, optimized models directly onto edge devices minimizes latency by avoiding network calls. Small models are specifically designed to run efficiently on devices with limited computational power, ensuring real-time performance.

  • Deploy large-scale language models (LLMs) on edge devices with sufficient processing power is incorrect because large language models require significantly more memory and compute resources, which are typically not available on edge devices. Even if deployed, they would result in higher latency due to slower inference times.
  • Use cloud-hosted small language models (SLMs) through an API for communication with edge devices is incorrect because even though small models require fewer resources, using a cloud API introduces network latency, which can delay predictions compared to running models locally on the device.
  • Use cloud-hosted large language models (LLMs) through an API for real-time predictions is incorrect because large models are computationally intensive, and network latency from communicating with a cloud-hosted LLM would increase the time required to get predictions, failing to meet the requirement for low latency.

Reference:
AWS for the Edge

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

A company wants to improve developer productivity and streamline software development using generative AI. The company plans to use Amazon Q Developer.

What functionality does Amazon Q Developer offer to help meet these goals?

  1. Generate code snippets, track references, and manage open source licenses.
  2. Run applications without needing to provision or manage servers.
  3. Enable voice-activated coding and natural language search capabilities.
  4. Convert audio files into text documents using machine learning models.
A

1. Generate code snippets, track references, and manage open source licenses.

Amazon Q Developer is designed to assist developers by automating tasks like creating code snippets, tracking dependencies, and ensuring compliance with open-source licenses. This helps improve developer productivity by reducing manual effort.

  • Run applications without needing to provision or manage servers is incorrect because running applications without server management refers to AWS Lambda or serverless computing, not Amazon Q Developer.
  • Enable voice-activated coding and natural language search capabilities is incorrect because Amazon Q Developer does not focus on voice commands or natural language search features, which are more associated with tools like Amazon Alexa.
  • Convert audio files into text documents using machine learning models is incorrect because converting audio to text is typically done by Amazon Transcribe, not Amazon Q Developer, which is focused on developer tools and productivity.

Reference:
Amazon Q Developer

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

A media company has implemented a generative AI solution that uses large language models (LLMs) to automatically generate subtitles for video content in different languages. The company wants to assess the quality of the translations generated by the model.

Which model evaluation strategy should the company use?

  1. Root mean squared error (RMSE)
  2. Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
  3. Bilingual Evaluation Understudy (BLEU)
  4. F1 score
A

3. Bilingual Evaluation Understudy (BLEU)

BLEU is a widely used metric for evaluating the quality of machine-generated translations by comparing them to reference translations. It measures the accuracy of translation models, making it ideal for assessing subtitle generation.

  • Root mean squared error (RMSE) is incorrect because RMSE is used to measure the differences between predicted and observed values in regression models, not for evaluating text translation quality.
  • Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is incorrect because ROUGE is primarily used for evaluating text summarization, not translation accuracy.
  • F1 score is incorrect because the F1 score measures the balance between precision and recall in classification tasks, not for assessing translations in natural language processing.

Reference:
What is Natural Language Processing (NLP)?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

A travel company is using a pre-trained large language model (LLM) to create a chatbot that provides vacation suggestions. The company needs the chatbot’s responses to be concise and delivered in a specific language.

Which solution will help ensure the LLM produces responses that meet the company’s requirements?

  1. Adjust the prompt.
  2. Select an LLM with different architecture.
  3. Raise the temperature value.
  4. Increase the maximum number of tokens.
A

1. Adjust the prompt.

Adjusting the prompt allows the company to control the style, length, and language of the responses generated by the LLM. Providing clear instructions within the prompt helps guide the model to produce shorter responses and ensures they are written in the specified language.

  • Select an LLM with different architecture is incorrect because changing the architecture or size of the model does not directly control the length or language of its outputs. Prompt adjustments are a more effective solution for this need.
  • Raise the temperature value is incorrect because increasing the temperature makes the model’s responses more creative and diverse, which could lead to longer and less predictable outputs.
  • Increase the maximum number of tokens is incorrect because increasing the token limit allows for longer responses, which contradicts the company’s need for concise answers.

Reference:
Prompt Engineering Concepts

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

A fashion brand wants to use a pre-trained generative AI model to create content for its marketing campaigns. The company needs to ensure that the generated content reflects the brand’s tone and messaging.

Which solution will meet these requirements?

  1. Optimize the model’s architecture and hyperparameters to improve the model’s accuracy.
  2. Add more layers to the model to increase its ability to generate complex content.
  3. Craft detailed prompts with clear instructions to ensure the generated content aligns with the brand’s voice.
  4. Pre-train a new generative model using a massive dataset tailored to the brand.
A

3. Craft detailed prompts with clear instructions to ensure the generated content aligns with the brand’s voice.

Creating detailed prompts is the most efficient way to control the output of a pre-trained generative AI model. By specifying the tone, style, and context in the prompts, the company can guide the model to generate content that aligns with its brand messaging without needing to modify the model itself.

  • Optimize the model’s architecture and hyperparameters to improve the model’s accuracy is incorrect because optimizing the model’s performance does not directly ensure the content will match the brand’s tone. Clear prompts are a simpler and more effective way to control output.
  • Add more layers to the model to increase its ability to generate complex content is incorrect because adding layers increases the model’s complexity but does not guarantee that the generated content will align with the brand’s voice.
  • Pre-train a new generative model using a massive dataset tailored to the brand is incorrect because pre-training a new model is resource-intensive and unnecessary when clear prompts can easily guide the content generation to meet brand requirements.

Reference:
Prompt Engineering Concepts

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

A research organization wants to build an AI application using large language models (LLMs) to process scientific papers and automatically extract the main findings for easier review.

Which solution meets these requirements?

  1. Develop a summarization system.
  2. Build a text classification tool.
  3. Create a named entity recognition system.
  4. Implement a machine translation system.
A

1. Develop a summarization system.

A summarization system is designed to read large amounts of text and extract the key points, making it ideal for processing scientific papers and summarizing the main findings.

  • Build a text classification tool is incorrect because text classification organizes documents into predefined categories, which does not meet the requirement of summarizing key findings from scientific papers.
  • Create a named entity recognition system is incorrect because named entity recognition identifies specific entities such as names or dates, but does not summarize or extract key findings.
  • Implement a machine translation system is incorrect because machine translation converts text from one language to another but does not summarize documents.

Reference:
What is Natural Language Processing (NLP)?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

A healthcare company is building a virtual assistant using a large language model (LLM) to provide medical advice. The company wants to ensure that the LLM does not generate incorrect medical recommendations or misleading information when prompted by users.

Which action should the company take to mitigate these risks?

  1. Implement human-in-the-loop review for medical responses to ensure accuracy.
  2. Add more hidden layers to the LLM to improve accuracy in medical domains.
  3. Use a higher temperature setting to generate more diverse medical responses.
  4. Increase the token limit to allow for more detailed medical answers.
A

1. Implement human-in-the-loop review for medical responses to ensure accuracy.

Human-in-the-loop review ensures that sensitive medical responses generated by the LLM are validated by healthcare professionals. This approach adds a layer of oversight, helping to prevent the spread of incorrect or harmful medical advice.

  • Add more hidden layers to the LLM to improve accuracy in medical domains is incorrect because adding layers does not directly address the need for accurate medical advice. Human oversight is more effective in this context.
  • Use a higher temperature setting to generate more diverse medical responses is incorrect because increasing the temperature makes the responses more random and diverse, which could lead to less accurate medical information.
  • Increase the token limit to allow for more detailed medical answers is incorrect because increasing the token limit allows for longer responses but does not improve the accuracy or safety of the information provided.

Reference:
Create and Start a Human Loop

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

A company is using a large language model (LLM) to automate customer support. However, they are concerned that users could inject malicious instructions into prompts to manipulate the LLM into providing incorrect or unauthorized responses.

Which risk is the company trying to address?

  1. Model jailbreaking
  2. Prompt injection
  3. Data poisoning
  4. Overfitting
A

2. Prompt injection

Prompt injection refers to an attack where malicious users modify input prompts to manipulate the model into producing unintended, incorrect, or unauthorized outputs. This is a critical risk in systems that rely on prompt engineering for generating responses.

  • Model jailbreaking is incorrect because model jailbreaking involves bypassing restrictions or safeguards, but it’s not specifically about injecting malicious instructions into prompts.
  • Data poisoning is incorrect because data poisoning involves corrupting the training data to influence the model’s behavior, not manipulating input prompts.
  • Overfitting is incorrect because overfitting occurs when a model performs well on training data but poorly on new data, which is unrelated to prompt manipulation.

Reference:
Common Prompt Injection Attacks

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

A company is using AWS generative AI services to deploy a custom language model for customer support. The company wants to optimize costs while ensuring high availability and fast response times for users during peak hours.

Which factor should the company consider to manage these cost tradeoffs?

  1. Token-based pricing
  2. Data augmentation
  3. Use of on-demand EC2 instances
  4. Increased batch size for training
A

1. Token-based pricing

Token-based pricing is a key cost factor for generative AI services, where the cost is based on the number of tokens (words or characters) processed during inference. The company can manage costs by optimizing the number of tokens processed during peak hours, while maintaining high availability and responsiveness.

  • Data augmentation is incorrect because data augmentation increases the size of the training dataset, but it does not directly affect the cost tradeoffs for inference or responsiveness.
  • Use of on-demand EC2 instances is incorrect because on-demand EC2 pricing affects general compute costs, but token-based pricing is more relevant for AI model usage in this context.
  • Increased batch size for training is incorrect because batch size refers to training efficiency and does not directly affect the cost of inference during customer interactions.

Reference:
Amazon Bedrock Pricing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

A gaming company is deploying a custom generative AI model on AWS to generate real-time game dialogues. The company needs high availability and low-latency responses for global users, but it also wants to control costs.

Which factor should the company consider to manage these cost tradeoffs?

  1. Provisioned throughput
  2. Token-based pricing
  3. Data labeling
  4. Hyperparameter tuning
A

1. Provisioned throughput

Provisioned throughput allows the company to allocate dedicated resources for inference, ensuring high availability and low-latency responses during peak times while maintaining control over costs. By provisioning the right amount of throughput, the company can balance performance with cost efficiency.

  • Token-based pricing is incorrect because while token-based pricing impacts cost, the scenario here focuses on performance tradeoffs, which are better managed with provisioned throughput.
  • Data labeling is incorrect because data labeling is a part of model training and preparation, not a factor in managing inference performance or cost tradeoffs.
  • Hyperparameter tuning is incorrect because hyperparameter tuning is used to improve model performance during training, not for managing cost and availability tradeoffs in production environments.

Reference:
Amazon Bedrock Pricing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly