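✍️ STEP 23: Text Generation and Summarization

Creative text generation with GPT-2, and summarization with T5 and BART

📋 What you will learn in this step

How autoregressive generation works
Decoding strategies (Greedy, Beam Search, Top-k, Top-p, Temperature)
Text generation with GPT-2
Abstractive summarization (T5, BART)
Evaluation metrics (ROUGE, BLEU)

Practice problems: 5 questions

⚠️ Execution environment

Run the code in this step on Google Colab with a GPU runtime. Select "Runtime" → "Change runtime type" → "GPU".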

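🎨 1. Text Generation Basics

Text generation is the task of automatically producing a continuation of a given input (the prompt). It is the core technology behind conversational AI systems such as ChatGPT.

1-1. What is autoregressive generation?

【Autoregressive generation】

Definition:
  Generate the text one word at a time, conditioning each new word on all of the words before it
  (this step says "word" for simplicity; models such as GPT-2 actually operate on subword tokens)

Formula:
  P(whole text) = P(w₁) × P(w₂|w₁) × P(w₃|w₁,w₂) × …
  
  Meaning:
  ・probability of w₁ × probability that w₂ follows w₁ × probability that w₃ follows w₁,w₂ …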
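The chain-rule formula can be checked with a few lines of Python. This is a minimal sketch: the three conditional probabilities are illustrative values (assumptions, not real model output), and the helper names `sequence_probability` and `sequence_log_probability` are hypothetical.

```python
import math

# Illustrative conditional probabilities P(next word | context).
# These numbers are assumptions for the example, not real model output.
steps = [
    ("there", 0.30),  # P("there" | "Once upon a time")
    ("was", 0.40),    # P("was"   | "Once upon a time there")
    ("a", 0.35),      # P("a"     | "Once upon a time there was")
]

def sequence_probability(steps):
    """Multiply the per-step conditional probabilities (chain rule)."""
    prob = 1.0
    for _word, p in steps:
        prob *= p
    return prob

def sequence_log_probability(steps):
    """Same quantity in log space, which avoids underflow for long texts."""
    return sum(math.log(p) for _word, p in steps)

joint = sequence_probability(steps)
log_joint = sequence_log_probability(steps)
print(f"P(continuation)     = {joint:.4f}")
print(f"log P(continuation) = {log_joint:.4f}")
```

Working in log space is the usual choice in practice, because multiplying many probabilities below 1 quickly becomes too small to represent accurately.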


【Example of the generation process】

Input prompt: "Once upon a time"

Step 1:
  Input: "Once upon a time"
  Predict: compute the probability of each candidate next word
        "there" (30%), "in" (20%), "the" (15%), …
  Select: "there" (highest probability)
  Output: "Once upon a time there"

Step 2:
  Input: "Once upon a time there"
  Predict: "was" (40%), "lived" (25%), …
  Select: "was"
  Output: "Once upon a time there was"

Step 3:
  Input: "Once upon a time there was"
  Predict: "a" (35%), "an" (20%), …
  Select: "a"
  Output: "Once upon a time there was a"

… (repeat)

Stopping conditions:
  ・The <EOS> (end-of-sequence) token is generated
  ・The specified maximum length is reached


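【Diagram】

"Once upon a time" → [GPT] → "there" (0.30)
                              "in"    (0.20)
                              "the"   (0.15)
                              …
                     
                     Select "there", the most probable word
                     ↓
"Once upon a time there" → [GPT] → "was"   (0.40)
                                   "lived" (0.25)
                                   …

💡 Key points of autoregressive generation
Generates one word at a time (sequentially)
Predicts the next word conditioned on all previous words
Repeats until generation ends
GPT-family models use this approach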
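The loop described above can be sketched with a toy lookup table standing in for a real model. Everything here is an assumption for illustration: `NEXT_TOKEN_PROBS`, `generate_greedy`, and the entries for "king" and "who" are made up, the listed probabilities do not sum to 1 because the remaining candidates are elided, and the table only looks at the last word, whereas a real model conditions on the whole prefix.

```python
# Toy stand-in for a language model: last word -> candidate next words.
# Hypothetical values; a real model conditions on the entire prefix.
NEXT_TOKEN_PROBS = {
    "time": {"there": 0.30, "in": 0.20, "the": 0.15},
    "there": {"was": 0.40, "lived": 0.25},
    "was": {"a": 0.35, "an": 0.20},
    "a": {"king": 0.50, "<EOS>": 0.10},
    "king": {"<EOS>": 0.60, "who": 0.30},
}

def generate_greedy(prompt_tokens, max_length=10, eos_token="<EOS>"):
    """Append the most probable next word until <EOS> or max_length."""
    tokens = list(prompt_tokens)
    while len(tokens) < max_length:          # stopping condition 2: max length
        candidates = NEXT_TOKEN_PROBS.get(tokens[-1])
        if not candidates:                   # the toy table has no entry
            break
        next_token = max(candidates, key=candidates.get)
        if next_token == eos_token:          # stopping condition 1: <EOS>
            break
        tokens.append(next_token)
    return tokens

result = generate_greedy(["Once", "upon", "a", "time"])
print(" ".join(result))
```

Each pass through the loop feeds the text generated so far back in as input, which is exactly what makes the process autoregressive.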

    

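1-2. Comparing decoding strategies

There are several strategies for deciding how to choose the next word, each with different characteristics.

Strategy	How it works	Characteristics
Greedy Search	Always picks the most probable word	Fast, but monotonous and prone to repetition
Beam Search	Keeps the top k candidate sequences while searching	Higher quality, but still monotonous
Top-k Sampling	Samples randomly from the k most probable words	More diverse output
Top-p Sampling	Samples from the most probable words whose cumulative probability reaches p	Adjusts the number of candidates dynamically (recommended)
Temperature	Adjusts how peaked the probability distribution is (used together with sampling)	Controls creativity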

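1-3. Details of each strategy

【1. Greedy Search】

Method: at every step, pick the most probable word

Example:
  Probabilities: "the" (0.4), "a" (0.3), "an" (0.2), …
  Select: "the" (always the maximum)

Advantages: 
  ✅ Deterministic (the same input gives the same output)
  ✅ Fast
  
Disadvantages:
  ❌ Monotonous text
  ❌ Repeats the same phrases
  
Example of the problem:
  "The cat sat on the mat. The cat sat on the mat. The cat…"
  → the same pattern loops until the length limit is reached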


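【2. Beam Search】

Method: keep the top k candidate sequences at each step while searching

Parameter: beam_size (typically 3-10; called num_beams in Hugging Face generate())

Example (beam_size=2):
  Step 1: keep "I like" (0.3) and "I love" (0.25)
  Step 2: 
    "I like cats" (0.15), "I like dogs" (0.12)
    "I love music" (0.14), "I love food" (0.10)
  → keep the top two, "I like cats" and "I love music"

Advantages:
  ✅ More likely than Greedy to find a high-probability sequence
  
Disadvantages:
  ❌ Still monotonous
  ❌ Higher computational cost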
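Beam Search can be sketched in plain Python to show how it finds a sequence that Greedy misses. The table `NEXT` and its probabilities are hypothetical (they differ from the numbers in the example above and were chosen so that the greedy first choice leads to a worse final sequence), and `beam_search` is an illustrative helper, not a library function.

```python
import math

# Hypothetical next-word probabilities keyed by the previous word.
NEXT = {
    "<s>": {"I": 1.0},
    "I": {"like": 0.55, "love": 0.45},
    "like": {"cats": 0.40, "dogs": 0.35, "tea": 0.25},
    "love": {"music": 0.90, "food": 0.10},
}

def beam_search(start, steps, beam_size):
    """Return the best `beam_size` sequences as (tokens, log_prob) pairs."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            options = NEXT.get(tokens[-1], {})
            if not options:
                candidates.append((tokens, score))  # nothing to extend
                continue
            for word, p in options.items():
                # Sum log probabilities instead of multiplying probabilities
                candidates.append((tokens + [word], score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]              # prune to the beam width
    return beams

greedy_tokens, greedy_score = beam_search("<s>", steps=3, beam_size=1)[0]
beam_tokens, beam_score = beam_search("<s>", steps=3, beam_size=2)[0]
print("beam_size=1:", " ".join(greedy_tokens[1:]), round(math.exp(greedy_score), 3))
print("beam_size=2:", " ".join(beam_tokens[1:]), round(math.exp(beam_score), 3))
```

With a beam width of 1 the search reduces to Greedy Search. Keeping a second hypothesis alive lets the search recover from a locally attractive but globally worse first choice.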


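【3. Top-k Sampling】

Method: sample randomly from the k most probable words

Parameter: k (typically 30-50)

Example (k=3):
  Probabilities: "the" (0.4), "a" (0.3), "an" (0.2), "some" (0.05), …
  Candidates: only "the", "a", and "an"
  Select: sample from these three (in proportion to their probabilities)

Advantages:
  ✅ More diverse output
  
Disadvantages:
  ❌ Choosing a good value of k is difficult
  ❌ A fixed k can be too many or too few candidates depending on the context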
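A minimal sketch of the Top-k step, assuming the next-word distribution is available as a plain dictionary. The helper names `top_k_filter` and `sample` and the extra entry "this" are hypothetical; libraries apply the same idea to logits over the full vocabulary.

```python
import random

def top_k_filter(probs, k):
    """Keep the k most probable words and renormalize them to sum to 1."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {word: p / total for word, p in top}

def sample(probs, rng):
    """Draw one word in proportion to its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

probs = {"the": 0.4, "a": 0.3, "an": 0.2, "some": 0.05, "this": 0.05}
filtered = top_k_filter(probs, k=3)
print({w: round(p, 3) for w, p in filtered.items()})

rng = random.Random(0)  # fixed seed so the draws are reproducible
draws = [sample(filtered, rng) for _ in range(5)]
print(draws)
```

After filtering, "some" and "this" can never be drawn, no matter how many samples are taken.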


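【4. Top-p (Nucleus) Sampling】← recommended

Method: sample from the most probable words, taken in order until their cumulative probability reaches p

Parameter: p (typically 0.9-0.95)

Example (p=0.9):
  Probabilities: "the" (0.4), "a" (0.3), "an" (0.15), "some" (0.1), …
  Cumulative: 0.4 → 0.7 → 0.85 → 0.95
  Candidates: "the", "a", "an", "some" (add words until the cumulative probability passes 0.9)
  Select: sample from these candidates

Advantages:
  ✅ The number of candidates adapts automatically to the context
  ✅ Diverse yet natural output
  
In practice, Top-p is a common default choice for open-ended generation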
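The dynamic candidate count can be seen in a small sketch. It assumes the distribution is given as a dictionary; `top_p_filter` is a hypothetical helper, and the second, sharply peaked distribution is made up to show the contrast.

```python
def top_p_filter(probs, p):
    """Keep the most probable words until their cumulative mass reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for word, prob in ranked:
        kept.append((word, prob))
        cumulative += prob
        if cumulative >= p:      # nucleus is complete
            break
    total = sum(prob for _, prob in kept)
    return {word: prob / total for word, prob in kept}

# Flat distribution: several words are needed to reach 90% of the mass
flat = {"the": 0.4, "a": 0.3, "an": 0.15, "some": 0.1, "this": 0.05}
nucleus_flat = top_p_filter(flat, p=0.9)
print(list(nucleus_flat))

# Peaked distribution (hypothetical): a single word already covers 90%
peaked = {"the": 0.95, "a": 0.03, "an": 0.02}
nucleus_peaked = top_p_filter(peaked, p=0.9)
print(list(nucleus_peaked))
```

Unlike a fixed k, the nucleus shrinks when the model is confident and grows when it is uncertain.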


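【5. Temperature】

Method: adjust how peaked the probability distribution is

Parameter: T (Temperature)

Computation:
  new probabilities = softmax(logits / T)

Effect:
  T < 1.0: the distribution becomes sharper → confident, conservative
  T = 1.0: the original distribution is unchanged
  T > 1.0: the distribution becomes flatter → random, creative

Example:
  Original probabilities: "the" (0.5), "a" (0.3), "an" (0.2)
  
  T=0.5: "the" (0.66), "a" (0.24), "an" (0.11)  ← sharper (approx.)
  T=1.5: "the" (0.44), "a" (0.32), "an" (0.24) ← flatter (approx.)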
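The formula softmax(logits / T) can be applied directly to the three-word example. This sketch assumes the logits are simply the logarithms of the original probabilities, which reproduces the original distribution at T = 1.0; `apply_temperature` is a hypothetical helper name.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a distribution as softmax(logits / T), with logits = log(p)."""
    scaled = {w: math.log(p) / temperature for w, p in probs.items()}
    max_logit = max(scaled.values())   # subtract the max for numerical stability
    exps = {w: math.exp(v - max_logit) for w, v in scaled.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

probs = {"the": 0.5, "a": 0.3, "an": 0.2}
for t in (0.5, 1.0, 1.5):
    rescaled = apply_temperature(probs, t)
    print(f"T={t}:", {w: round(p, 2) for w, p in rescaled.items()})
```

Dividing by T < 1 widens the gaps between logits, so the top word gains probability. Dividing by T > 1 narrows the gaps and moves the distribution toward uniform.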

💡 Suggested settings

Creative writing: temperature=0.8, top_p=0.9
Factual writing: temperature=0.5, top_p=0.95
Dialogue: temperature=0.7, top_p=0.9
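One way to keep these settings in one place is a small preset table whose entries are passed as keyword arguments to generate(). This is only a sketch: `GENERATION_PRESETS` and `generation_kwargs` are hypothetical names, and the default of 50 new tokens is an arbitrary assumption. The keys `do_sample`, `temperature`, `top_p`, and `max_new_tokens` are standard generate() arguments in Hugging Face Transformers.

```python
# Sampling presets taken from the suggested settings above.
GENERATION_PRESETS = {
    "creative": {"do_sample": True, "temperature": 0.8, "top_p": 0.9},
    "factual": {"do_sample": True, "temperature": 0.5, "top_p": 0.95},
    "dialogue": {"do_sample": True, "temperature": 0.7, "top_p": 0.9},
}

def generation_kwargs(style, max_new_tokens=50):
    """Return keyword arguments for a generate() call in the given style."""
    if style not in GENERATION_PRESETS:
        raise ValueError(f"unknown style: {style!r}")
    # Copy the preset so callers cannot mutate the shared table
    return {**GENERATION_PRESETS[style], "max_new_tokens": max_new_tokens}

kwargs = generation_kwargs("factual")
print(kwargs)
# With a loaded model (not run here): model.generate(input_ids, **kwargs)
```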

🤖 2. Text Generation with GPT-2

GPT-2 is a language model developed by OpenAI. It is easy to use through Hugging Face.

2-1. Setting up the environment

# ========================================
# Install the required libraries
# ========================================

# sentencepiece is required by T5Tokenizer, which is used later in this step
!pip install transformers==4.35.0 sentencepiece -q

# ========================================
# Import the libraries
# ========================================

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Check whether a GPU is available (falls back to CPU otherwise)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

Output:

Using device: cuda

2-2. Loading the GPT-2 model

# ========================================
# Load the GPT-2 model
# ========================================

# Model name (several sizes are available)
# 'gpt2': 124M parameters (small)
# 'gpt2-medium': 355M parameters (medium)
# 'gpt2-large': 774M parameters (large)
# 'gpt2-xl': 1.5B parameters (extra large)
model_name = 'gpt2'

# Load the tokenizer
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

# Load the model
# GPT2LMHeadModel: language model head (predicts the next token)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Move the model to the GPU
model = model.to(device)

# Switch to evaluation mode (disables Dropout for inference)
model.eval()

print(f"Model: {model_name}")
print(f"Number of parameters: {sum(p.numel() for p in model.parameters()):,}")

Output:

Model: gpt2
Number of parameters: 124,439,808

2-3. Basic text generation

# ========================================
# Basic text generation
# ========================================

# Input prompt
prompt = "Once upon a time, in a kingdom far away,"

# Tokenize
# return_tensors='pt': return PyTorch tensors
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Move the input to the GPU
input_ids = input_ids.to(device)

print(f"Input prompt: {prompt}")
print(f"Number of tokens: {input_ids.shape[1]}")

Output:

Input prompt: Once upon a time, in a kingdom far away,
Number of tokens: 11

# ========================================
# Generate with Greedy Search
# ========================================

# model.generate(): generates text
# max_length: maximum total length in tokens (prompt + generated tokens)
# do_sample=False: no sampling (Greedy)
output = model.generate(
    input_ids,
    max_length=60,
    do_sample=False,           # Greedy Search
    pad_token_id=tokenizer.eos_token_id
)

# Decode (tokens → text)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print("=== Greedy Search ===")
print(generated_text)

Output:

=== Greedy Search ===
Once upon a time, in a kingdom far away, there was a young man who was very good at his job. He was a very good worker. He was a very good worker. He was a very good worker. He was

⚠️ The problem with Greedy Search

"He was a very good worker." is repeated over and over. This is the typical weakness of Greedy Search.

2-4. Generating with Top-p Sampling

# ========================================
# Generate with Top-p Sampling
# ========================================

# do_sample=True: enable sampling
# top_p=0.9: sample from the candidates that cover 90% cumulative probability
# temperature=0.7: slightly sharper than the model's original distribution
output = model.generate(
    input_ids,
    max_length=80,
    do_sample=True,            # enable sampling
    top_p=0.9,                 # Top-p Sampling
    temperature=0.7,           # controls randomness
    pad_token_id=tokenizer.eos_token_id
)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print("=== Top-p Sampling ===")
print(generated_text)

Output (example):

=== Top-p Sampling ===
Once upon a time, in a kingdom far away, there lived a brave knight named Sir Arthur. He embarked on a quest to find the legendary sword that was hidden deep in the enchanted forest. Along the way, he encountered mystical creatures and faced many challenges.

✅ The effect of Top-p Sampling

The repetition is gone, and the text is more creative and natural. Because sampling is random, the output differs on every run. For practical open-ended text generation, Top-p Sampling is a common choice.

2-5. Comparing temperatures

# ========================================
# Compare temperatures
# ========================================

prompt = "The future of artificial intelligence is"
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)

temperatures = [0.3, 0.7, 1.0, 1.5]

print("=== Temperature comparison ===\n")

for temp in temperatures:
    output = model.generate(
        input_ids,
        max_length=50,
        do_sample=True,
        top_p=0.9,
        temperature=temp,
        pad_token_id=tokenizer.eos_token_id
    )
    
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    print(f"Temperature = {temp}:")
    print(f"  {text}")
    print()

Output (example):

=== Temperature comparison ===

Temperature = 0.3:
  The future of artificial intelligence is likely to be dominated by machine learning and deep learning algorithms. These technologies are expected to revolutionize various industries.

Temperature = 0.7:
  The future of artificial intelligence is exciting and full of possibilities. We can expect AI to play an increasingly important role in our daily lives.

Temperature = 1.0:
  The future of artificial intelligence is uncertain yet fascinating. Some experts predict rapid advancements while others remain cautious about potential risks.

Temperature = 1.5:
  The future of artificial intelligence is quantum raspberry cosmic synthesizers dancing through nebulous datastreams while telepathic algorithms whisper secrets… (← too high; the text becomes incoherent)

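【Interpreting temperature】

T = 0.3 (low):
  ・Confident and conservative
  ・Predictable text
  ・Suited to factual content

T = 0.7 (medium):
  ・Well balanced
  ・Suitable for many use cases
  ・Recommended setting

T = 1.0 (original distribution):
  ・The model's own probabilities
  ・Somewhat creative

T = 1.5 (high):
  ・Very random
  ・Tends to become incoherent
  ・Not normally used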

2-6. Generating multiple candidates

# ========================================
# Generate multiple candidates
# ========================================

prompt = "Once upon a time, in a kingdom far away,"
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)

# num_return_sequences: number of candidates to generate
output = model.generate(
    input_ids,
    max_length=60,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    num_return_sequences=3,    # generate three candidates
    pad_token_id=tokenizer.eos_token_id
)

print("=== Multiple candidates ===\n")
for i, seq in enumerate(output, 1):
    text = tokenizer.decode(seq, skip_special_tokens=True)
    print(f"Candidate {i}:")
    print(f"  {text}")
    print()

Output (example):

=== Multiple candidates ===

Candidate 1:
  Once upon a time, in a kingdom far away, there was a magical forest where fairies danced under the moonlight…

Candidate 2:
  Once upon a time, in a kingdom far away, ruled a wise queen who made decisions that brought prosperity to her people…

Candidate 3:
  Once upon a time, in a kingdom far away, lived a curious child who dreamed of exploring the world beyond the mountains…

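📝 3. Text Summarization

Text summarization is the task of condensing a long text into a shorter one. There are two approaches.

3-1. Extractive vs Abstractive

【Two approaches to summarization】

■ Extractive summarization
  "Extracts" the important sentences from the original text
  
  Example:
  Original text:
    "AI is developing rapidly. Machine learning is a branch of AI.
     Deep learning has revolutionized image recognition.
     Further progress is expected."
  
  Extractive summary:
    "AI is developing rapidly.
     Deep learning has revolutionized image recognition."
    ↑ original sentences selected as they are
  
  Characteristics:
  ✅ Grammatically correct (uses the original sentences)
  ✅ Every sentence comes verbatim from the source, so no new facts are invented
  ❌ Tends to be redundant
  ❌ Limited to selecting existing sentences


■ Abstractive summarization ← covered in this step
  Understands the content and "generates" new sentences
  
  Example:
  Original text: (same as above)
  
  Abstractive summary:
    "AI technology, deep learning in particular, is developing rapidly,
     and further progress is expected."
    ↑ summarized in new wording
  
  Characteristics:
  ✅ Concise and natural summaries
  ✅ Can paraphrase and merge information
  ❌ Risk of hallucination (stating things the source does not say)
  ❌ High computational cost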
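For contrast with the abstractive models used in the rest of this step, here is a deliberately naive extractive summarizer. It is a sketch under stated assumptions: sentences are scored by average word frequency, which is a simple heuristic chosen for illustration (real extractive systems use stronger scoring and usually remove stop words), and `extractive_summary` is a hypothetical helper.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=2):
    """Pick the highest-scoring sentences and return them in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        # Average document-level frequency of the words in the sentence
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])   # restore document order
    return [sentences[i] for i in chosen]

text = (
    "AI is developing rapidly. Machine learning is a branch of AI. "
    "Deep learning has revolutionized image recognition. "
    "Further progress is expected."
)
summary = extractive_summary(text, num_sentences=2)
print(" ".join(summary))
```

Every sentence in the result is copied unchanged from the input, which is the defining property of the extractive approach.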

3-2. Summarization with T5

T5 (Text-to-Text Transfer Transformer) is a model that handles a wide range of tasks in a uniform "text → text" format.

# ========================================
# Load the T5 model
# ========================================

from transformers import T5Tokenizer, T5ForConditionalGeneration

# T5 models
# t5-small: 60M (small)
# t5-base: 220M (medium) ← used here
# t5-large: 770M (large)
model_name_t5 = 't5-base'

tokenizer_t5 = T5Tokenizer.from_pretrained(model_name_t5)
model_t5 = T5ForConditionalGeneration.from_pretrained(model_name_t5)

model_t5 = model_t5.to(device)
model_t5.eval()

print(f"T5 model: {model_name_t5}")
print(f"Number of parameters: {sum(p.numel() for p in model_t5.parameters()):,}")

Output:

T5 model: t5-base
Number of parameters: 222,903,552

3-3. Running summarization with T5

# ========================================
# Article to summarize
# ========================================

article = """
The Amazon rainforest, also known as Amazonia, is a moist broadleaf
tropical rainforest in the Amazon biome that covers most of the Amazon
basin of South America. This basin encompasses 7 million square kilometers,
of which 5.5 million square kilometers are covered by the rainforest.
The majority of the forest is contained within Brazil, with 60% of the
rainforest, followed by Peru with 13%, Colombia with 10%, and with minor
amounts in Venezuela, Ecuador, Bolivia, Guyana, Suriname, and French Guiana.
The Amazon represents over half of the planet's remaining rainforests,
and comprises the largest and most biodiverse tract of tropical rainforest
in the world.
"""

print("=== Original article ===")
print(article)
print(f"Characters: {len(article)}")

Output:

=== Original article ===
The Amazon rainforest, also known as Amazonia, is a moist broadleaf…
Characters: 723

# ========================================
# Generate a summary with T5
# ========================================

# T5 input format: "summarize: " + text
input_text = f"summarize: {article}"

# Tokenize
inputs = tokenizer_t5(
    input_text,
    return_tensors='pt',
    max_length=512,        # maximum input length in tokens
    truncation=True        # truncate longer inputs
)
inputs = {k: v.to(device) for k, v in inputs.items()}

# Generate the summary
with torch.no_grad():
    outputs = model_t5.generate(
        **inputs,
        max_length=150,    # maximum output length in tokens
        min_length=50,     # minimum output length in tokens
        length_penalty=2.0, # beam search length penalty (values > 0 favor longer outputs)
        num_beams=4,       # Beam Search
        early_stopping=True
    )

# Decode
summary = tokenizer_t5.decode(outputs[0], skip_special_tokens=True)

print("=== T5 summary ===")
print(summary)
print(f"Characters: {len(summary)}")

Output:

=== T5 summary ===
the Amazon rainforest is a tropical rainforest in the Amazon biome. it covers most of the Amazon basin of South America. the majority of the forest is in Brazil. the Amazon represents over half of the planet's remaining rainforests.
Characters: 234

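【Key points of T5 summarization】

1. Input format
   "summarize: " + text
   → T5 recognizes the task from this prefix

2. Main parameters
   max_length: maximum number of tokens in the summary
   min_length: minimum number of tokens in the summary
   length_penalty: length penalty used with Beam Search (values above 0 favor longer outputs, values below 0 favor shorter ones)
   num_beams: beam width for Beam Search

3. Result
   Original article: 723 characters
   Summary: 234 characters
   → compressed to about 32% of the original length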

3-4. Adjusting the summary length

# ========================================
# Adjust the summary length
# ========================================

def summarize_with_length(text, max_len, min_len):
    """
    Generate a summary whose length (in tokens) is between min_len and max_len.
    """
    input_text = f"summarize: {text}"
    inputs = tokenizer_t5(
        input_text,
        return_tensors='pt',
        max_length=512,
        truncation=True
    )
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    with torch.no_grad():
        outputs = model_t5.generate(
            **inputs,
            max_length=max_len,
            min_length=min_len,
            num_beams=4,
            early_stopping=True
        )
    
    return tokenizer_t5.decode(outputs[0], skip_special_tokens=True)

# Short summary
print("=== Short summary (30-50 tokens) ===")
short_summary = summarize_with_length(article, max_len=50, min_len=30)
print(short_summary)

# Long summary
print("\n=== Long summary (100-150 tokens) ===")
long_summary = summarize_with_length(article, max_len=150, min_len=100)
print(long_summary)

Output:

=== Short summary (30-50 tokens) ===
the Amazon rainforest covers most of the Amazon basin. it is the largest tropical rainforest in the world.

=== Long summary (100-150 tokens) ===
the Amazon rainforest, also known as Amazonia, is a moist broadleaf tropical rainforest in the Amazon biome. the basin encompasses 7 million square kilometers. the majority of the forest is contained within Brazil, with 60% of the rainforest. the Amazon represents over half of the planet's remaining rainforests.

🚀 4. Using the Hugging Face Pipeline

With the Hugging Face Pipeline, summarization takes even less code.

4-1. Summarization pipeline

# ========================================
# Use the summarization pipeline
# ========================================

from transformers import pipeline

# Create the summarization pipeline
# model: the model to use (fine-tuned on the CNN/DailyMail news dataset)
summarizer = pipeline(
    'summarization',
    model='facebook/bart-large-cnn',
    device=0 if torch.cuda.is_available() else -1
)

print("Summarization pipeline loaded")

Output:

Summarization pipeline loaded

# ========================================
# Summarize with the pipeline
# ========================================

# Run the summarization
summary = summarizer(
    article,
    max_length=130,
    min_length=30,
    do_sample=False
)

print("=== BART summary ===")
print(summary[0]['summary_text'])

Output:

=== BART summary ===
The Amazon rainforest covers most of the Amazon basin of South America. The basin encompasses 7 million square kilometers. The majority of the forest is in Brazil, followed by Peru, Colombia, and Venezuela.

💡 Benefits of the Pipeline
Simple: summarization in just a few lines of code
High quality: uses a model that is already fine-tuned
BART: the facebook/bart-large-cnn checkpoint is fine-tuned for news summarization

    

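📊 5. Evaluation Metrics

This section covers metrics for measuring the quality of generated text and summaries.

5-1. ROUGE Score

【ROUGE (Recall-Oriented Understudy for Gisting Evaluation)】

Definition:
  Measures the overlap between a generated summary and a reference summary

Three variants:
  ROUGE-1: overlap of single words (unigrams)
  ROUGE-2: overlap of word pairs (bigrams)
  ROUGE-L: longest common subsequence (LCS)


【ROUGE-1 worked example】

Reference: "The cat sat on the mat"
Generated: "The cat is on the mat"

Reference words: {The, cat, sat, on, the, mat} → 6 (repeated words are counted, case is ignored)
Generated words: {The, cat, is, on, the, mat} → 6
Shared words: {The, cat, on, the, mat} → 5

Precision = shared / generated = 5/6 = 0.833
Recall = shared / reference = 5/6 = 0.833
F1 = 2 × P × R / (P + R) = 0.833


【ROUGE-2 worked example】

Reference bigrams: {The cat, cat sat, sat on, on the, the mat}
Generated bigrams: {The cat, cat is, is on, on the, the mat}
Shared: {The cat, on the, the mat} → 3

Precision = 3/5 = 0.60
Recall = 3/5 = 0.60
F1 = 0.60


【ROUGE-L】

Uses the longest common subsequence (LCS)
  Reference: "The cat sat on the mat"
  Generated: "The cat is on the mat"
  
  LCS: "The cat on the mat" (5 words)
  
ROUGE-L respects word order, but the matched words do not have to be contiguous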
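The three worked examples can be reproduced with a short from-scratch implementation. This is a simplified sketch (lowercasing and whitespace tokenization only, no stemming), so it will not match a full ROUGE library on every input; `ngrams`, `rouge_n`, `lcs_length`, and `rouge_l` are hypothetical helper names.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def prf(overlap, gen_total, ref_total):
    """Precision, recall, and F1 from an overlap count."""
    precision = overlap / gen_total if gen_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

def rouge_n(reference, generated, n):
    ref = Counter(ngrams(reference.lower().split(), n))
    gen = Counter(ngrams(generated.lower().split(), n))
    overlap = sum((ref & gen).values())   # multiset intersection
    return prf(overlap, sum(gen.values()), sum(ref.values()))

def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference, generated):
    ref, gen = reference.lower().split(), generated.lower().split()
    return prf(lcs_length(ref, gen), len(gen), len(ref))

reference = "The cat sat on the mat"
generated = "The cat is on the mat"
r1 = rouge_n(reference, generated, 1)
r2 = rouge_n(reference, generated, 2)
rl = rouge_l(reference, generated)
for name, (p, r, f) in (("ROUGE-1", r1), ("ROUGE-2", r2), ("ROUGE-L", rl)):
    print(f"{name}: P={p:.3f} R={r:.3f} F1={f:.3f}")
```

The `Counter` intersection counts a repeated word such as "the" at most as many times as it appears in both texts, which is why the unigram overlap is 5 rather than 4.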

5-2. Computing ROUGE

# ========================================
# Install and compute ROUGE
# ========================================

!pip install rouge-score -q

from rouge_score import rouge_scorer

# Create the scorer
scorer = rouge_scorer.RougeScorer(
    ['rouge1', 'rouge2', 'rougeL'],
    use_stemmer=True  # apply stemming
)

# ========================================
# Compute the ROUGE scores
# ========================================

# Reference summary (the human-written target)
reference = """
The Amazon rainforest is a tropical rainforest covering most of
the Amazon basin in South America. It is the largest rainforest
in the world and contains incredible biodiversity.
"""

# Generated summary (the model output)
generated = """
The Amazon rainforest covers the Amazon basin of South America.
It is the world's largest rainforest with remarkable biodiversity.
"""

# Compute the scores: score(target, prediction)
scores = scorer.score(reference, generated)

print("=== ROUGE scores ===\n")
for metric, score in scores.items():
    print(f"{metric.upper()}:")
    print(f"  Precision: {score.precision:.4f}")
    print(f"  Recall:    {score.recall:.4f}")
    print(f"  F1:        {score.fmeasure:.4f}")
    print()

Output (example):

=== ROUGE scores ===

ROUGE1:
  Precision: 0.7619
  Recall:    0.6957
  F1:        0.7273

ROUGE2:
  Precision: 0.5000
  Recall:    0.4286
  F1:        0.4615

ROUGEL:
  Precision: 0.6667
  Recall:    0.6087
  F1:        0.6364

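【Interpreting the scores】

ROUGE-1 F1 = 0.73
  73% overlap at the word level
  → a good score

ROUGE-2 F1 = 0.46
  46% overlap at the bigram level
  → word order also matches in part

ROUGE-L F1 = 0.64
  64% by longest common subsequence
  → the sentence structure is similar as well


【Rough guide to ROUGE scores】

・ROUGE-1 F1 > 0.4: acceptable
・ROUGE-1 F1 > 0.5: good
・ROUGE-1 F1 > 0.6: excellent

※ The thresholds vary by task and dataset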

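5-3. BLEU Score

【BLEU (Bilingual Evaluation Understudy)】

Definition:
  Developed as an evaluation metric for machine translation
  Also used to evaluate text generation

Computation:
  Computes the n-gram match rate (precision)
  Applies a penalty to outputs that are too short (Brevity Penalty)


【Differences between ROUGE and BLEU】

ROUGE: recall-oriented (what share of the reference is covered)
BLEU: precision-oriented (what share of the generated text is correct)

Typical use:
  ROUGE: widely used for evaluating summaries
  BLEU: widely used for evaluating translations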
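The two ingredients of BLEU named above, n-gram precision and the Brevity Penalty, can be sketched separately. This is not a full BLEU implementation (it omits the geometric mean over 1- to 4-grams and any smoothing); `ngram_counts`, `clipped_precision`, and `brevity_penalty` are hypothetical helper names, and the degenerate hypothesis "the the the the" is a textbook-style example chosen to show why counts are clipped.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(reference, hypothesis, n):
    """Share of hypothesis n-grams found in the reference, with clipped counts."""
    ref = ngram_counts(reference, n)
    hyp = ngram_counts(hypothesis, n)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    # Each n-gram is credited at most as often as it occurs in the reference
    clipped = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return clipped / total

def brevity_penalty(ref_len, hyp_len):
    """1.0 for hypotheses longer than the reference, exp(1 - r/c) otherwise."""
    if hyp_len == 0:
        return 0.0
    if hyp_len > ref_len:
        return 1.0
    return math.exp(1 - ref_len / hyp_len)

reference = "the cat is on the mat".split()
hypothesis = "the the the the".split()

p1 = clipped_precision(reference, hypothesis, 1)
bp = brevity_penalty(len(reference), len(hypothesis))
print(f"clipped unigram precision = {p1:.2f}")
print(f"brevity penalty           = {bp:.4f}")
```

Without clipping, every "the" in the hypothesis would count as a match and the precision would be a perfect 1.0. The Brevity Penalty then further discounts the score because the hypothesis is shorter than the reference.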

# ========================================
# Compute the BLEU score
# ========================================

!pip install sacrebleu -q

from sacrebleu.metrics import BLEU

# BLEU scorer
bleu = BLEU()

# Reference and generated text
reference = ["The Amazon rainforest is the largest tropical rainforest."]
generated = ["The Amazon is the world's largest rainforest."]

# Compute the score: corpus_score(hypotheses, list of reference lists)
score = bleu.corpus_score(generated, [reference])

print("=== BLEU score ===")
print(f"BLEU: {score.score:.2f}")

Output (example):

=== BLEU score ===
BLEU: 42.58

Metric	Emphasis	Main use
ROUGE	Recall (coverage)	Evaluating summaries
BLEU	Precision (accuracy)	Evaluating translations

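📝 Practice Problems

Question 1: Decoding strategies

Which decoding strategy always selects the most probable word?

a. Beam Search
b. Top-k Sampling
c. Greedy Search
d. Top-p Sampling

Answer: c (Greedy Search)

Greedy Search always selects the most probable word.

Characteristics:

Deterministic (the same input gives the same output)
Fast
Drawback: monotonous, prone to repetition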

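Question 2: Temperature

What happens when you raise the temperature?

a. Generation becomes more confident and conservative
b. Generation becomes more deterministic
c. Generation becomes more random and creative
d. Generation becomes faster

Answer: c (generation becomes more random and creative)

Raising the temperature flattens the probability distribution, so low-probability words are selected more often.

Effect:

T < 1.0: confident, conservative
T = 1.0: the original distribution
T > 1.0: random, creative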

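Question 3: Types of summarization

Which summarization method extracts important sentences from the original text?

a. Abstractive Summarization
b. Extractive Summarization
c. Hybrid Summarization
d. Generative Summarization

Answer: b (Extractive Summarization)

Extractive Summarization selects important sentences from the original text.

Comparison:

Extractive: uses the original sentences as they are
Abstractive: generates new sentences (T5, BART, etc.)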

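Question 4: ROUGE metrics

What does ROUGE-2 evaluate?

a. Overlap of single words (unigrams)
b. Overlap of word pairs (bigrams)
c. Longest common subsequence
d. Number of sentences

Answer: b (overlap of word pairs)

ROUGE-2 evaluates the overlap of bigrams (pairs of consecutive words).

ROUGE metrics:

ROUGE-1: unigrams (single words)
ROUGE-2: bigrams (word pairs)
ROUGE-L: longest common subsequence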

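Question 5: Autoregressive generation

Which of the following is a drawback of autoregressive generation?

a. It can generate fluent, natural text
b. It can take the preceding context into account
c. It is computationally expensive, and errors propagate
d. It is easy to implement

Answer: c (it is computationally expensive, and errors propagate)

Drawbacks:

High computational cost: words are generated sequentially, one at a time
Error propagation: an early mistake affects everything generated after it

Advantages:

Fluent, natural text
Takes all of the preceding context into account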

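📝 STEP 23 Summary

✅ What you learned in this step

Autoregressive generation: generating sequentially, one word at a time
Decoding strategies: Greedy, Beam Search, Top-k, Top-p, Temperature
GPT-2: implementing creative text generation
Abstractive summarization: generative summaries with T5 and BART
Evaluation metrics: ROUGE (summarization), BLEU (translation)

🎉 Part 6 complete! You have covered the applied tasks!

Congratulations! You have learned how to implement practical NLP tasks.

Skills acquired:

Sentiment analysis and text classification (STEP 20)
Named entity recognition (STEP 21)
Question answering (STEP 22)
Text generation and summarization (STEP 23)

🎯 Preparing for the next step

In Part 7: Capstone Projects (STEP 24-25), you will take on projects that combine all of the techniques you have learned.

STEP 24: Building a Japanese news classification system
STEP 25: Independent project (for example, Twitter sentiment analysis)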
