Training language models to follow instructions with human feedback
- [2022]
- arxiv.org
Aligning Language Models to Follow Instructions
- Official blog
- openai.com

Chat GPT / 2023

A sibling model of Instruct GPT.

Chat GPT
- yhayato1320.hatenablog.com

Contrastive Pre Training / CPT / 2022

Text and Code Embeddings by Contrastive Pre-Training
- [2022]
- arxiv.org

SpikeGPT / 2023

Uses a spiking neural network (SNN).

Spiking Neural Network
- yhayato1320.hatenablog.com
SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
- [2023]
- arxiv.org

GPT-4

A multimodal large language model that handles both language and images.

GPT-4
- yhayato1320.hatenablog.com

HuggingGPT / 2023

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
- arxiv.org
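"Too many AIs, no idea which one to use..." solved by the AI "HuggingGPT": automatically selects a suitable machine learning model from text input alone (article in Japanese)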
- www.itmedia.co.jp

FrugalGPT / 2023

Improvements aimed at reducing cost.

FrugalGPT
- yhayato1320.hatenablog.com

Applications to specific domains

Finance

BloombergGPT / 2023

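The financial information services company Bloomberg trained BloombergGPT, a 50.6-billion-parameter model, on 569 billion tokens, aiming for a general-purpose language model that is strong in finance.

The training dataset combines financial text (about 363 billion tokens) and general text (about 345 billion tokens), for a total of about 700 billion tokens.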
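
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch that uses only the approximate numbers quoted in this section; the function and variable names are hypothetical and chosen for illustration.

```python
# Approximate figures quoted in the text above, in billions.
FINANCIAL_TOKENS_B = 363   # financial text
GENERAL_TOKENS_B = 345     # general text
TRAINED_TOKENS_B = 569     # tokens actually used for training
PARAMS_B = 50.6            # model parameters


def dataset_summary(financial_b: float, general_b: float) -> dict:
    """Return the dataset total and the share of each text source."""
    total = financial_b + general_b
    return {
        "total_b": total,
        "financial_share": financial_b / total,
        "general_share": general_b / total,
    }


summary = dataset_summary(FINANCIAL_TOKENS_B, GENERAL_TOKENS_B)

# Training tokens per parameter, and the fraction of the dataset consumed.
tokens_per_param = TRAINED_TOKENS_B / PARAMS_B
fraction_seen = TRAINED_TOKENS_B / summary["total_b"]

print(f"dataset total: {summary['total_b']}B tokens")
print(f"financial share: {summary['financial_share']:.1%}")
print(f"tokens per parameter: {tokens_per_param:.1f}")
print(f"fraction of dataset used in training: {fraction_seen:.1%}")
```

The two sources sum to 708 billion tokens, which matches the "about 700 billion" total, and the mix is close to half financial and half general text.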

BloombergGPT: A Large Language Model for Finance
- [2023]
- arxiv.org

Applications to other modalities

Images

Image GPT

Image generation.

Image GPT
- yhayato1320.hatenablog.com

Mario GPT / 2023

MarioGPT: Open-Ended Text2Level Generation through Large Language Models
- [2023]
- arxiv.org

Seg GPT / 2023

SegGPT: Segmenting Everything In Context
- [2023]
- arxiv.org
- github.com
- huggingface.co

Audio

AudioGPT / 2023

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
- [2023]
- arxiv.org

Multimodal

TagGPT / 2023

TagGPT: Large Language Models are Zero-shot Multimodal Taggers
- [2023]
- arxiv.org

Application / Service

GPT SAN

github.com

GraphGPT

nanoGPT

github.com

picoGPT

github.com

Viper GPT

ViperGPT: Visual Inference via Python Execution for Reasoning
- [2023]
- arxiv.org
- viper.cs.columbia.edu

X-GPT

X-GPT: Connecting generalist X-Decoder with GPT-3
- github.com

Cerebras-GPT

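Cerebras-GPT, an open-source family of language models with up to 13 billion parameters, was announced.

Training follows the Chinchilla scaling laws.

There are seven sizes (parameter counts: 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, 13B).

Scaling laws were derived using an open dataset.

Trained on non-GPU hardware (Cerebras CS-2 wafer-scale systems).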
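
To make the Chinchilla reference concrete, here is a minimal sketch that estimates a compute-optimal token budget for each of the seven sizes listed above. It assumes the commonly cited rule of thumb of about 20 training tokens per parameter and the usual 6 x N x D approximation for training FLOPs; both constants, and all names in the code, are illustrative assumptions rather than values taken from this page.

```python
# Assumption: Chinchilla-style rule of thumb of ~20 training tokens per parameter.
TOKENS_PER_PARAM = 20

# The seven Cerebras-GPT sizes listed above (parameter counts).
MODEL_SIZES = {
    "111M": 111e6,
    "256M": 256e6,
    "590M": 590e6,
    "1.3B": 1.3e9,
    "2.7B": 2.7e9,
    "6.7B": 6.7e9,
    "13B": 13e9,
}


def compute_optimal_tokens(n_params: float, tokens_per_param: float = TOKENS_PER_PARAM) -> float:
    """Estimate the training token budget for a model with n_params parameters."""
    return n_params * tokens_per_param


def approx_training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training cost using the common 6 * N * D approximation."""
    return 6 * n_params * n_tokens


for name, n_params in MODEL_SIZES.items():
    n_tokens = compute_optimal_tokens(n_params)
    flops = approx_training_flops(n_params, n_tokens)
    print(f"{name:>5}: ~{n_tokens / 1e9:,.1f}B tokens, ~{flops:.2e} FLOPs")
```

Under these assumptions the token budget grows linearly with model size while the compute grows quadratically, which is why the largest model dominates the total training cost of the family.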

Cerebras Systems Releases Seven New GPT Models Trained on CS-2 Wafer-Scale Systems
- www.businesswire.com
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
- [2023]
- arxiv.org
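Seven variants of the open-source, GPT-based large language model "Cerebras-GPT" become downloadable by anyone at once (article in Japanese)
- gigazine.net
Running Cerebras-GPT, a large language model that handles Japanese (article in Japanese)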
- nowokay.hatenablog.com

Auto-GPT

Auto-GPT: An Autonomous GPT-4 Experiment
- github.com
What is "Auto-GPT"? The basics of the next powerful AI tool (article in Japanese)
- https://news.yahoo.co.jp/articles/c7b992c1ebe7778731baa5ac2c4b5e811c4be91b

Rinna

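Release of a 1.3-billion-parameter GPT language model specialized for Japanese (in Japanese)
- rinna.co.jp
Trying Rinna-3.6B on Google Colab (in Japanese)
- note.com
Developer of AI Rinna releases a 3.6-billion-parameter GPT language model specialized for Japanese (in Japanese)
- 0115765.com
rinna releases a Japanese-specialized LLM with 3.6 billion parameters (in Japanese)
- www.watch.impress.co.jp
Trying LoRA fine-tuning of Rinna-3.6B on Google Colab (in Japanese)
- note.com
rinna succeeds in reinforcement learning of a GPT language model using human evaluation (in Japanese)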
- rinna.co.jp

Multi-modal GPT
- github.com

Evaluation methods

GPT SCORE / 2023

GPTScore: Evaluate as You Desire
- [2023]
- arxiv.org

GPTEval / 2023

GPTEval: NLG Evaluation using GPT-4 with Better Human Alignment
- [2023]
- arxiv.org

Other

How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation
- [2023]
- arxiv.org

References

Websites

Model index for researchers
- beta.openai.com
- Algorithms and models behind the GPT models
What Meta’s Galactica missteps mean for GPT-4 | The AI Beat
- venturebeat.com

Videos

Let's build GPT: from scratch, in code, spelled out.
- www.youtube.com

Index

GPT

GPT-1 / 2018

GPT-2 / 2019

GPT-3 / 2020

GPT-J / 2021

Codex / 2021

GPT-3.5 Series / 2021

Instruct GPT / 2022