DeepSeek Docs

The DeepSeek API provides a platform compatible with OpenAI's API format, allowing users to access various models including the upgraded DeepSeek-V3 and the reasoning model DeepSeek-R1. Users can obtain an API key and utilize example scripts in different programming languages to invoke the chat API. Pricing is based on token usage, with specific costs outlined for input and output tokens for each model.

https://api-docs.deepseek.com/

DeepSeek API Docs
Your First API Call
The DeepSeek API uses an API format compatible with OpenAI's. By modifying the configuration, you can use the OpenAI SDK, or software compatible with the OpenAI API, to access the DeepSeek API.
PARAM      VALUE
base_url * https://api.deepseek.com
api_key    apply for an API key

* To be compatible with OpenAI, you can also use https://api.deepseek.com/v1 as the base_url. But note that the v1 here has NO relationship with the model's version.
* The deepseek-chat model has been upgraded to DeepSeek-V3. The API remains unchanged. You can invoke DeepSeek-V3 by specifying model='deepseek-chat'.
* deepseek-reasoner is the latest reasoning model, DeepSeek-R1, released by DeepSeek. You can invoke DeepSeek-R1 by specifying model='deepseek-reasoner'.
Invoke The Chat API
Once you have obtained an API key, you can access the DeepSeek API using the following example scripts. These are non-streaming examples; set the stream parameter to true to get a streamed response.

curl:

curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <DeepSeek API Key>" \
  -d '{
        "model": "deepseek-chat",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Hello!"}
        ],
        "stream": false
      }'

python:

# Please install OpenAI SDK first: `pip3 install openai`
from openai import OpenAI

client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    stream=False
)

print(response.choices[0].message.content)

nodejs:

// Please install OpenAI SDK first: `npm install openai`
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: 'https://api.deepseek.com',
  apiKey: '<DeepSeek API Key>'
});

async function main() {
  const completion = await openai.chat.completions.create({
    messages: [{ role: "system", content: "You are a helpful assistant." }],
    model: "deepseek-chat",
  });
  console.log(completion.choices[0].message.content);
}

main();
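The streaming variant mentioned above changes only the stream flag. As a minimal sketch (the helper name `streaming_request` is illustrative, not part of the SDK; the network call is shown in comments):

```python
# Streaming variant of the Python example above: the only change to the
# request is stream=True; the SDK then yields chunks whose deltas carry
# partial content.
def streaming_request(user_message: str) -> dict:
    """Build the same chat request as above, but with streaming enabled."""
    return {
        "model": "deepseek-chat",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": user_message},
        ],
        "stream": True,
    }

# Usage with the client from the example above:
#   for chunk in client.chat.completions.create(**streaming_request("Hello")):
#       delta = chunk.choices[0].delta
#       if delta.content:
#           print(delta.content, end="", flush=True)
```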
Copyright © 2025 DeepSeek, Inc.

------- • -------


https://api-docs.deepseek.com/quick_start/pricing

Models & Pricing
The prices listed below are in units of per 1M tokens. A token, the smallest unit of text that the model recognizes, can be a word, a number, or even a punctuation mark. Billing is based on the total number of input and output tokens processed by the model.
Pricing Details

Prices in USD (per 1M tokens; original price(5) / discounted price where both are shown):

deepseek-chat(1) — context length 64K, max CoT tokens(2): -, max output tokens(3): 8K
  Input (cache hit)(4):  $0.07(5) / $0.014
  Input (cache miss):    $0.27(5) / $0.14
  Output:                $1.10(5) / $0.28

deepseek-reasoner(1) — context length 64K, max CoT tokens(2): 32K, max output tokens(3): 8K
  Input (cache hit)(4):  $0.14
  Input (cache miss):    $0.55
  Output:                $2.19 (6)

Prices in CNY (per 1M tokens):

deepseek-chat — input (cache hit): ¥0.5(5) / ¥0.1; input (cache miss): ¥2(5) / ¥1; output: ¥8(5) / ¥2
deepseek-reasoner — input (cache hit): ¥1; input (cache miss): ¥4; output: ¥16 (6)

(1) The deepseek-chat model has been upgraded to DeepSeek-V3. deepseek-reasoner points to the new model DeepSeek-R1.
(2) CoT (Chain of Thought) is the reasoning content deepseek-reasoner produces before the final answer. For details, please refer to Reasoning Model.
(3) If max_tokens is not specified, the default maximum output length is 4K. Please adjust max_tokens to support longer outputs.
(4) Please check DeepSeek Context Caching for the details of Context Caching.
(5) The table shows the original price and the discounted price. From now until 2025-02-08 16:00 (UTC), all users can enjoy the discounted prices of the DeepSeek API. After that, prices return to the full rate. DeepSeek-R1 is not included in the discount.
(6) The output token count of deepseek-reasoner includes all tokens from the CoT and the final answer, and they are priced equally.

Deduction Rules
The expense = number of tokens × price.
The corresponding fees will be deducted directly from your topped-up balance or granted balance; the granted balance is used first when both are available.
Product prices may vary, and DeepSeek reserves the right to adjust them. We recommend topping up based on your actual usage and checking this page regularly for the most recent pricing information.
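As a sketch of the deduction rule above (expense = number of tokens × price), the following applies the full USD prices from the table. The price dictionary and `cost_usd` helper are illustrative, not an official SDK interface, and they ignore the time-limited discount:

```python
# Full USD prices per 1M tokens, taken from the pricing table above.
# Note (6): for deepseek-reasoner, output tokens include both the CoT
# and the final answer, priced equally.
PRICES_USD_PER_M = {
    "deepseek-chat":     {"hit": 0.07, "miss": 0.27, "out": 1.10},
    "deepseek-reasoner": {"hit": 0.14, "miss": 0.55, "out": 2.19},
}

def cost_usd(model: str, hit_tokens: int, miss_tokens: int, output_tokens: int) -> float:
    """Expense = number of tokens x price, with prices quoted per 1M tokens."""
    p = PRICES_USD_PER_M[model]
    return (hit_tokens * p["hit"]
            + miss_tokens * p["miss"]
            + output_tokens * p["out"]) / 1_000_000
```

For example, 1M cache-miss input tokens on deepseek-chat costs $0.27 at the full rate.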

------- • -------

https://api-docs.deepseek.com/quick_start/parameter_settings

The Temperature Parameter
The default value of temperature is 1.0.

We recommend setting the temperature according to the use cases listed below.

USE CASE                        TEMPERATURE
Coding / Math                   0.0
Data Cleaning / Data Analysis   1.0
General Conversation            1.3
Translation                     1.3
Creative Writing / Poetry       1.5
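The recommendations above can be captured in a small lookup. The mapping keys here are illustrative labels (not API values), and the fallback is the documented default of 1.0:

```python
# Recommended temperatures from the table above, keyed by illustrative
# use-case labels. The API default of 1.0 is used as the fallback.
RECOMMENDED_TEMPERATURE = {
    "coding": 0.0,
    "math": 0.0,
    "data_cleaning": 1.0,
    "data_analysis": 1.0,
    "general_conversation": 1.3,
    "translation": 1.3,
    "creative_writing": 1.5,
    "poetry": 1.5,
}

def temperature_for(use_case: str) -> float:
    """Return the recommended temperature, or the default of 1.0."""
    return RECOMMENDED_TEMPERATURE.get(use_case, 1.0)
```

With the SDK, this would be passed as `temperature=temperature_for("translation")` in `chat.completions.create`.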

------- • -------

https://api-docs.deepseek.com/quick_start/token_usage

Token & Token Usage
Tokens are the basic units used by models to represent natural language text, and
also the units we use for billing. They can be intuitively understood as
'characters' or 'words'. Typically, a Chinese word, an English word, a number, or a
symbol is counted as a token.
Generally, the conversion ratio between tokens and the number of characters is approximately as follows:

1 English character ≈ 0.3 token.


1 Chinese character ≈ 0.6 token.

However, due to the different tokenization methods used by different models, the
conversion ratios can vary. The actual number of tokens processed each time is
based on the model's return, which you can view from the usage results.
Calculate token usage offline
You can run the demo tokenizer code in the following zip package to calculate the token usage for your input/output.
deepseek_v3_tokenizer.zip
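For a rough estimate without running the packaged tokenizer, the approximate ratios above can be applied directly. This is only a heuristic of my own construction, since the actual count depends on the model's tokenizer:

```python
# Rough offline token estimate based on the approximate ratios above:
# 1 English character ~ 0.3 token, 1 Chinese character ~ 0.6 token.
# The exact count depends on the model's tokenizer (see the zip package);
# use this only for ballpark estimates, not billing.
def estimate_tokens(text: str) -> int:
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")  # CJK ideographs
    other = len(text) - cjk
    return round(other * 0.3 + cjk * 0.6)
```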

------- • -------

https://api-docs.deepseek.com/quick_start/rate_limit

Rate Limit
The DeepSeek API does NOT constrain users' rate limits. We will try our best to serve every request.
However, please note that when our servers are under high traffic pressure, your
requests may take some time to receive a response from the server. During this
period, your HTTP request will remain connected, and you may continuously receive
contents in the following formats:

Non-streaming requests: Continuously return empty lines


Streaming requests: Continuously return SSE keep-alive comments (: keep-alive)

These contents do not affect the parsing of the JSON body by the OpenAI SDK. If you are parsing the HTTP responses yourself, please make sure to handle these empty lines and comments appropriately.
If the request is still not completed after 30 minutes, the server will close the connection.
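If you do parse the stream yourself, the keep-alive handling described above might look like this minimal sketch. `parse_sse_lines` is an illustrative helper, and real responses contain richer chunk objects than the simulated sample here:

```python
# Minimal sketch of parsing a chat-completions SSE stream by hand,
# skipping the keep-alive comments (lines starting with ':') and the
# empty lines the server may send under high traffic.
import json

def parse_sse_lines(lines):
    """Yield decoded JSON chunks from raw SSE lines, ignoring padding."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(":"):  # empty line or keep-alive comment
            continue
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":           # end-of-stream sentinel
                break
            yield json.loads(payload)

# Example with a simulated stream:
sample = [
    ": keep-alive",
    "",
    'data: {"choices": [{"delta": {"content": "Hi"}}]}',
    "data: [DONE]",
]
chunks = list(parse_sse_lines(sample))
```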

------- • -------

https://api-docs.deepseek.com/quick_start/error_codes

Error Codes
When calling the DeepSeek API, you may encounter errors. The causes and solutions are listed below.
400 - Invalid Format
  Cause: Invalid request body format.
  Solution: Please modify your request body according to the hints in the error message. For more API format details, please refer to DeepSeek API Docs.

401 - Authentication Fails
  Cause: Authentication fails due to a wrong API key.
  Solution: Please check your API key. If you don't have one, please create an API key first.

402 - Insufficient Balance
  Cause: You have run out of balance.
  Solution: Please check your account's balance, and go to the Top up page to add funds.

422 - Invalid Parameters
  Cause: Your request contains invalid parameters.
  Solution: Please modify your request parameters according to the hints in the error message. For more API format details, please refer to DeepSeek API Docs.

429 - Rate Limit Reached
  Cause: You are sending requests too quickly.
  Solution: Please pace your requests reasonably. We also advise users to temporarily switch to the APIs of alternative LLM service providers, like OpenAI.

500 - Server Error
  Cause: Our server encounters an issue.
  Solution: Please retry your request after a brief wait and contact us if the issue persists.

503 - Server Overloaded
  Cause: The server is overloaded due to high traffic.
  Solution: Please retry your request after a brief wait.
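For the transient errors above (429, 500, 503), retrying with exponential backoff on the client side is a common pattern. This is a hedged sketch: `APIStatusError` and `with_retries` are illustrative stand-ins for your SDK's error type and API call, not part of any official library:

```python
# Illustrative retry helper for the transient errors listed above.
# Non-transient errors (400/401/402/422) are re-raised immediately,
# since retrying them cannot succeed.
import time

TRANSIENT = {429, 500, 503}

class APIStatusError(Exception):
    """Stand-in for an SDK error carrying an HTTP status code."""
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status

def with_retries(call_api, max_attempts=4, base_delay=0.5):
    """Call call_api(), retrying transient failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call_api()
        except APIStatusError as e:
            if e.status not in TRANSIENT or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```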

------- • -------

https://api-docs.deepseek.com/news/news250120

DeepSeek-R1 Release 2025/01/20

⚡ Performance on par with OpenAI-o1

📖 Fully open-source model & technical report

🏆 MIT licensed: Distill & commercialize freely!

🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today!

🔥 Bonus: Open-Source Distilled Models!

🔬 Distilled from DeepSeek-R1, 6 small models fully open-sourced

📏 32B & 70B models on par with OpenAI-o1-mini

🤝 Empowering the open-source community

🌍 Pushing the boundaries of open AI!

📜 License Update!

🔄 DeepSeek-R1 is now MIT licensed for clear open access

🔓 Open for the community to leverage model weights & outputs

API outputs can now be used for fine-tuning & distillation

DeepSeek-R1: Technical Highlights

📈 Large-scale RL in post-training
🏆 Significant performance boost with minimal labeled data

🔢 Math, code, and reasoning tasks on par with OpenAI-o1

📄 More details:
https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf

🌐 API Access & Pricing

⚙️ Use DeepSeek-R1 by setting model=deepseek-reasoner

💰 $0.14 / million input tokens (cache hit)

💰 $0.55 / million input tokens (cache miss)

💰 $2.19 / million output tokens

📖 API guide: https://api-docs.deepseek.com/guides/reasoning_model
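A minimal sketch of calling the reasoning model, assuming the OpenAI-SDK pattern from the quick start. The `reasoner_request` helper is illustrative, and the `reasoning_content` field on the response message is described in the linked API guide:

```python
# Build a request for the reasoning model. Per the linked guide, the
# response message is expected to carry the chain of thought in
# `reasoning_content` alongside the final answer in `content`.
def reasoner_request(question: str) -> dict:
    return {
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }

# With the SDK client from the quick start, this would be used as:
#   response = client.chat.completions.create(**reasoner_request("9.11 or 9.8, which is greater?"))
#   print(response.choices[0].message.reasoning_content)  # chain of thought
#   print(response.choices[0].message.content)            # final answer
```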


------- • -------

https://api-docs.deepseek.com/news/news250115

Introducing DeepSeek App 2025/01/15

💡 Powered by world-class DeepSeek-V3


🆓 FREE to use with seamless interaction

📱 Now officially available on App Store & Google Play & Major Android markets

🔗 Download now: https://download.deepseek.com/app/

Key Features of DeepSeek App:

🔐 Easy login: E-mail/Google Account/Apple ID

☁️ Cross-platform chat history sync

🔍 Web search & Deep-Think mode

📄 File upload & text extraction

Important Notice:

✅ 100% FREE - No ads, no in-app purchases

Download only from official channels to avoid being misled

📲 Search "DeepSeek" in your app store or visit our website for direct links


------- • -------

https://api-docs.deepseek.com/news/news1226

🚀 Introducing DeepSeek-V3 2024/12/26
Biggest leap forward yet

⚡ 60 tokens/second (3x faster than V2!)


💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🎉 What’s new in V3

🧠 671B MoE parameters


🚀 37B activated parameters
📚 Trained on 14.8T high-quality tokens

🔗 Dive deeper here:

Model 👉 https://github.com/deepseek-ai/DeepSeek-V3
Paper 👉 https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf

💰 API Pricing Update

🎉 Until Feb 8: same as V2!


🤯 From Feb 8 onwards:

Input (cache miss): $0.27/M tokens
Input (cache hit):  $0.07/M tokens
Output:             $1.10/M tokens


Still the best value in the market! 🔥

🌌 Open-source spirit + Longtermism to inclusive AGI


🌟 DeepSeek’s mission is unwavering. We’re thrilled to share our progress with the
community and see the gap between open and closed models narrowing.
🚀 This is just the beginning! Look forward to multimodal support and other cutting-
edge features in the DeepSeek ecosystem.
💡 Together, let’s push the boundaries of innovation!

------- • -------

https://api-docs.deepseek.com/news/news1210

DeepSeek-V2.5-1210 Release 2024/12/10
🚀 DeepSeek V2.5: The Grand Finale 🎉
🌐 Internet Search is now live on the web! Visit https://chat.deepseek.com/ and
toggle “Internet Search” for real-time answers. 🕒

📊 DeepSeek-V2.5-1210 raises the bar across benchmarks like math, coding, writing,
and roleplay—built to serve all your work and life needs.
🔧 Explore the open-source model on Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-V2.5-1210

🙌 With the release of DeepSeek-V2.5-1210, the V2.5 series comes to an end.


💪 Since May, the DeepSeek V2 series has brought 5 impactful updates, earning your
trust and support along the way.
✨ As V2 closes, it’s not the end—it’s the beginning of something greater. DeepSeek
is working on next-gen foundation models to push boundaries even further. Stay
tuned!
“Every end is a new beginning.”

------- • -------

https://api-docs.deepseek.com/news/news1120

DeepSeek-R1-Lite Release 2024/11/20
🚀 DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power!
🔍 o1-preview-level performance on AIME & MATH benchmarks.
💡 Transparent thought process in real-time.
Open-source models & API coming soon!
🌐 Try it now at https://chat.deepseek.com

🌟 Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!

🌟 Inference Scaling Laws of DeepSeek-R1-Lite-Preview


Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score
improvements on AIME as thought length increases.

------- • -------

https://api-docs.deepseek.com/news/news0905

DeepSeek-V2.5 Release 2024/09/05
DeepSeek-V2.5: A New Open-Source Model Combining General and Coding Capabilities
We’ve officially launched DeepSeek-V2.5 – a powerful combination of DeepSeek-V2-
0628 and DeepSeek-Coder-V2-0724! This new version not only retains the general
conversational capabilities of the Chat model and the robust code processing power
of the Coder model but also better aligns with human preferences. Additionally,
DeepSeek-V2.5 has seen significant improvements in tasks such as writing and
instruction-following. The model is now available on both the web and API, with
backward-compatible API endpoints. Users can access the new model via deepseek-
coder or deepseek-chat. Features like Function Calling, FIM completion, and JSON
output remain unchanged. The all-in-one DeepSeek-V2.5 offers a more streamlined,
intelligent, and efficient user experience.
Version History
DeepSeek has consistently focused on model refinement and optimization. In June, we
upgraded DeepSeek-V2-Chat by replacing its base model with the Coder-V2-base,
significantly enhancing its code generation and reasoning capabilities. This led to
the release of DeepSeek-V2-Chat-0628. Shortly after, DeepSeek-Coder-V2-0724 was
launched, featuring improved general capabilities through alignment optimization.
Ultimately, we successfully merged the Chat and Coder models to create the new
DeepSeek-V2.5.

Note: Due to significant updates in this version, if performance drops in certain cases, we recommend adjusting the system prompt and temperature settings for the best results!
General Capabilities

General Capability Evaluation

We assessed DeepSeek-V2.5 using industry-standard test sets. DeepSeek-V2.5 outperforms both DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724 on most benchmarks. In our internal Chinese evaluations, DeepSeek-V2.5 shows a significant improvement in win rates against GPT-4o mini and ChatGPT-4o-latest (judged by GPT-4o) compared to DeepSeek-V2-0628, especially in tasks like content creation and Q&A, enhancing the overall user experience.

Safety Evaluation
Balancing safety and helpfulness has been a key focus during our iterative
development. In DeepSeek-V2.5, we have more clearly defined the boundaries of model
safety, strengthening its resistance to jailbreak attacks while reducing the
overgeneralization of safety policies to normal queries.
Model            | Overall Safety Score (higher is better)* | Safety Spillover Rate (lower is better)**
DeepSeek-V2-0628 | 74.4%                                    | 11.3%
DeepSeek-V2.5    | 82.6%                                    | 4.6%
* Scores based on internal test sets: higher scores indicate greater overall safety.
** Scores based on internal test sets: lower percentages indicate less impact of safety measures on normal queries.
Code Capabilities
In the coding domain, DeepSeek-V2.5 retains the powerful code capabilities of
DeepSeek-Coder-V2-0724. It demonstrated notable improvements in the HumanEval
Python and LiveCodeBench (Jan 2024 – Sep 2024) tests. While DeepSeek-Coder-V2-0724 performed slightly better on the HumanEval Multilingual and Aider tests, both versions scored relatively low on the SWE-verified test, indicating room for further improvement. Moreover, on the FIM completion task, the DS-FIM-Eval internal test set showed a 5.1% improvement, enhancing the plugin completion experience.
DeepSeek-V2.5 has also been optimized for common coding scenarios to improve user
experience. In the DS-Arena-Code internal subjective evaluation, DeepSeek-V2.5
achieved a significant win rate increase against competitors, with GPT-4o serving
as the judge.

Open-Source
DeepSeek-V2.5 is now open-source on Hugging Face! Check it out:
https://huggingface.co/deepseek-ai/DeepSeek-V2.5

------- • -------

https://api-docs.deepseek.com/news/news0802

Context Caching is Available 2024/08/02
DeepSeek API introduces Context Caching on Disk, cutting prices by an order of magnitude
In large language model API usage, a significant portion of user inputs tends to be
repetitive. For instance, user prompts often include repeated references, and in
multi-turn conversations, previous content is frequently re-entered.
To address this, DeepSeek has implemented Context Caching on Disk technology. This
innovative approach caches content that is expected to be reused on a distributed
disk array. When duplicate inputs are detected, the repeated parts are retrieved
from the cache, bypassing the need for recomputation. This not only reduces service
latency but also significantly cuts down on overall usage costs.
For cache hits, DeepSeek charges $0.014 per million tokens, slashing API costs by up to 90%.[1]

Hint 1: The API price of DeepSeek-V3 has been updated. For details, please refer to
Models & Pricing.

How to Use DeepSeek API's Caching Service


The disk caching service is now available for all users, requiring no code or
interface changes. The cache service runs automatically, and billing is based on
actual cache hits.
Note that only requests with identical prefixes (starting from the 0th token) will
be considered duplicates. Partial matches in the middle of the input will not
trigger a cache hit.
Here are two classic cache usage scenarios:
1. Multi-turn conversation: The next turn can hit the context cache generated by
the previous turn.

2. Data analysis: Subsequent requests with the same prefix can hit the context
cache.
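Both scenarios work because the cache only matches requests that are identical from the very first token onward. As a purely illustrative sketch (not part of the DeepSeek API), the helper below measures the serialized prefix that two requests share; the real cache operates on token prefixes in 64-token units, not characters:

```python
import json

def shared_prefix_chars(prev_messages, next_messages):
    """Character length of the identical serialized prefix of two requests.

    Illustrative only: the real cache matches identical token prefixes
    (starting from the 0th token, in 64-token units), not characters.
    """
    prev = json.dumps(prev_messages, ensure_ascii=False)
    nxt = json.dumps(next_messages, ensure_ascii=False)
    n = 0
    for a, b in zip(prev, nxt):
        if a != b:
            break
        n += 1
    return n

# Multi-turn conversation: round 2 re-sends round 1's messages unchanged,
# so almost the entire round-1 request is a shared (cacheable) prefix.
round1 = [{"role": "user", "content": "What's the highest mountain in the world?"}]
round2 = round1 + [
    {"role": "assistant", "content": "Mount Everest."},
    {"role": "user", "content": "What is the second?"},
]
print(shared_prefix_chars(round1, round2))
```

Editing anything in the middle of the history, by contrast, breaks the shared prefix at the point of the edit, and everything after it misses the cache.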

Beneficial Scenarios for Context Caching on Disk:

Q&A assistants with long preset prompts
Role-play with extensive character settings and multi-turn conversations
Data analysis with recurring queries on the same documents/files
Code analysis and debugging with repeated repository references
Improving model output performance through few-shot learning
...

For more detailed instructions, please refer to the guide Use Context Caching.
Monitoring Cache Hits
Two new fields in the API response's usage section help users monitor cache
performance:

prompt_cache_hit_tokens: Number of tokens from the input that were served from the cache ($0.014 per million tokens)
prompt_cache_miss_tokens: Number of tokens from the input that were not served from the cache ($0.14 per million tokens)
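Since the two fields are billed at rates an order of magnitude apart, they can be combined to estimate the input-side cost of a request. A minimal sketch using the cache-hit and cache-miss prices quoted in this announcement (the usage numbers below are made up for illustration):

```python
def input_cost_usd(prompt_cache_hit_tokens, prompt_cache_miss_tokens):
    """Estimate input-side cost from the two usage fields.

    Rates from this announcement: $0.014 per million cached tokens,
    $0.14 per million uncached tokens.
    """
    return (prompt_cache_hit_tokens * 0.014 + prompt_cache_miss_tokens * 0.14) / 1_000_000

# Hypothetical usage: 90% of a 1M-token input served from the cache.
cost = input_cost_usd(900_000, 100_000)
print(f"${cost:.4f}")  # → $0.0266, versus $0.14 with no cache hits
```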

Reducing Latency
First token latency is significantly reduced for requests with long, repetitive inputs.
For a 128K prompt with a largely repeated prefix, the first token latency is cut from 13s to just 500ms.
Lowering Costs
Users can save up to 90% on costs with optimization for cache characteristics.
Even without any optimization, historical data shows that users save over 50% on
average.
The service has no additional fees beyond the $0.014 per million tokens for cache
hits, and storage usage for the cache is free.
Security Concerns
The cache system is designed with a robust security strategy.
Each user's cache is isolated and logically invisible to others, ensuring data
privacy and security.
Unused cache entries are automatically cleared after a period, ensuring they are
not retained or repurposed.
Why DeepSeek Leads with Disk Caching
Based on publicly available information, DeepSeek appears to be the first large
language model provider globally to implement extensive disk caching in API
services.
This is made possible by the MLA architecture in DeepSeek V2, which enhances model
performance while significantly reducing the size of the context KV cache, enabling
efficient storage on low-cost disks.
DeepSeek API’s Concurrency and Rate Limits
The DeepSeek API is designed to handle up to 1 trillion tokens per day, with no
limits on concurrency or rate, ensuring high-quality service for all users. Feel
free to scale up your parallelism.

The cache system uses 64 tokens as a storage unit; content less than 64 tokens will
not be cached.
The cache system does not guarantee 100% cache hits.
Unused cache entries are automatically cleared, typically within a few hours to days.


------- • -------

https://api-docs.deepseek.com/news/news0725

New API Features 2024/07/25
DeepSeek API Upgrade
Now Supporting Chat Prefix Completion, FIM, Function Calling and JSON Output
Today, the DeepSeek API releases a major update, equipped with new interface
features to unlock more potential of the model:

Update API /chat/completions

JSON Output
Function Calling
Chat Prefix Completion (Beta)
8K max_tokens (Beta)

New API /completions

FIM Completion (Beta)

All new features above are open to the two models: deepseek-chat and deepseek-
coder.

Update API /chat/completions


1. JSON Output: Strengthened Formatted Output
DeepSeek API now supports JSON Output. Compatible with the OpenAI API, it forces the model to output a valid JSON string.
When performing tasks such as data processing, this feature allows the model to
return JSON in a predefined format, facilitating the subsequent parsing of the
model's output and enhancing the automation capabilities of the program flow.
To use JSON Output, users need to:

Set response_format to {'type': 'json_object'}


Guide the model to output JSON format in the prompt to ensure that the output
format meets your expectations
Set max_tokens appropriately to prevent the JSON string from being truncated midway

The following is an example of JSON Output. In this example, the user provides a piece of text, and the model formats the questions and answers within the text into JSON.
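The example on the original page was shown as a screenshot. A minimal sketch of the same idea follows; the prompts, the max_tokens value, and the sample reply string are illustrative, and actually sending the request requires a real API key:

```python
import json

# Request construction only; sending it requires a real API key.
request = {
    "model": "deepseek-chat",
    "messages": [
        {"role": "system", "content": 'Parse the question and answer from the '
         'text and reply in JSON, e.g. {"question": "...", "answer": "..."}'},
        {"role": "user", "content": "Which is the longest river in the world? The Nile River."},
    ],
    "response_format": {"type": "json_object"},
    "max_tokens": 512,  # headroom so the JSON string is not truncated midway
}
assert "json" in request["messages"][0]["content"].lower()  # prompt must mention JSON

# The model's reply arrives as a valid JSON string, parseable directly
# (this reply is a hand-written sample, not an actual API response):
reply = '{"question": "Which is the longest river in the world?", "answer": "The Nile River"}'
parsed = json.loads(reply)
print(parsed["answer"])  # → The Nile River
```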

For detailed guide, please refer to JSON Output Guide.

2. Function Calling, Connecting The Physical World


DeepSeek API now supports Function Calling. Compatible with the OpenAI API, it allows the model to interact with the physical world via external tools.
Function Calling supports multiple functions in one call (up to 128). It supports
parallel function calls.
The image below demonstrates the integration of deepseek-coder into the open-source
large model frontend LobeChat. In this example, we enabled the "Website Crawler"
plugin to perform website crawling and summarization.

The image below illustrates the interaction process using the Function Calling
feature:
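The screenshots from the original page are not reproduced here. As an illustrative sketch, a tool definition in the OpenAI-compatible "tools" format looks like the following; the weather tool, its name, and its parameters are hypothetical, not built into the DeepSeek API:

```python
# Hypothetical tool definition in the OpenAI-compatible "tools" format.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative name, not a built-in
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
            },
            "required": ["location"],
        },
    },
}

# Up to 128 such definitions may be passed in one request via the
# `tools` parameter of /chat/completions.
tools = [get_weather_tool]
print(tools[0]["function"]["name"])
```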

For detailed guide, please refer to Function Calling Guide.

3. Chat Prefix Completion (Beta), More Flexible Output Control


Chat Prefix Completion follows the API format of Chat Completion, allowing users to
specify the prefix of the last assistant message for the model to complete. This
feature can also be used to concatenate messages that were truncated due to
reaching the max_tokens limit and resend the request to continue the truncated
content.
To use Chat Prefix Completion, users need to:

Set base_url to https://api.deepseek.com/beta to enable the Beta features


Ensure that the role of the last message in the messages list is assistant, and set
the prefix parameter of the last message to True, for example: {"role":
"assistant", "content": "Once upon a time,", "prefix": True}

The following is an example of using Chat Prefix Completion. In this example, the
beginning of the assistant message is set to '```python\n' to enforce the output to
start with a code block, and the stop parameter is set to '```' to prevent the
model from outputting extra content.
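The original page showed this example as an image. The message list it describes can be sketched as follows (request construction only; the quick-sort prompt is the one described above):

```python
# Sketch of the request body for Chat Prefix Completion (Beta).
# The assistant prefix forces the reply to start inside a Python code
# block; stop=["```"] ends generation when the block closes.
messages = [
    {"role": "user", "content": "Please write quick sort code"},
    {"role": "assistant", "content": "```python\n", "prefix": True},
]
request_kwargs = {
    "model": "deepseek-chat",
    "messages": messages,
    "stop": ["```"],
}

last = messages[-1]
print(last["role"], last["prefix"])
```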

For detailed guide, please refer to Chat Prefix Completion Guide.

4. 8K max_tokens (Beta): Unlocking Longer Outputs


To accommodate scenarios requiring longer text output, we have adjusted the upper
limit of the max_tokens parameter to 8K in the Beta API.
To use 8K max_tokens, users need to:

Set base_url to https://api.deepseek.com/beta to enable the Beta features


max_tokens defaults to 4096. By enabling the Beta API, max_tokens can be set up to 8192

New API /completions


1. FIM Completion (Beta), Enabling More Completion Scenarios
DeepSeek API now supports FIM (Fill-In-the-Middle) Completion. Compatible with the OpenAI FIM Completion API, it allows users to provide custom prefixes/suffixes (optional) for the model to complete the content in between. This feature is commonly used in scenarios such as story completion and code completion. The FIM Completion API is charged the same as the Chat Completion API.
To use FIM Completion, users need to set base_url to https://api.deepseek.com/beta to enable the Beta features.
The following is an example of using the FIM Completion API. In this example, the
user provides the beginning and the end of a Fibonacci sequence function, and the
model completes the content in the middle.
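The original example was shown as an image; a sketch of the corresponding request body for the Beta /completions endpoint, mirroring the Fibonacci example described above (field values illustrative):

```python
# Sketch of an FIM (Fill-In-the-Middle) request body for the Beta
# /completions endpoint; the model fills in between prompt and suffix.
fim_request = {
    "model": "deepseek-chat",
    "prompt": "def fib(a):",                      # code before the gap
    "suffix": "    return fib(a-1) + fib(a-2)",   # code after the gap
    "max_tokens": 128,
}
print(sorted(fim_request))
```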

For detailed guide, please refer to FIM Completion Guide.

Update Statements
The Beta API is open to all users. Users need to set base_url to https://api.deepseek.com/beta to enable the Beta features.
Beta APIs are considered unstable, and their subsequent testing and release plans may change flexibly. Thank you for your understanding.
The related model versions will be released to the open-source community once the functionality is stable.

------- • -------

https://api-docs.deepseek.com/api/deepseek-api

API Reference: Introduction
Version: 1.0.0
DeepSeek API
The DeepSeek API. To use the DeepSeek API, please create an API key first.
Authentication: HTTP Bearer Auth
Security Scheme Type: http
HTTP Authorization Scheme: bearer
Contact: DeepSeek Support ([email protected])
Terms of Service: https://platform.deepseek.com/downloads/DeepSeek%20Open%20Platform%20Terms%20of%20Service.html
License: MIT


------- • -------

https://api-docs.deepseek.com/guides/reasoning_model

Reasoning Model (deepseek-reasoner)
deepseek-reasoner is a reasoning model developed by DeepSeek. Before delivering the
final answer, the model first generates a Chain of Thought (CoT) to enhance the
accuracy of its responses. Our API provides users with access to the CoT content
generated by deepseek-reasoner, enabling them to view, display, and distill it.
When using deepseek-reasoner, please upgrade the OpenAI SDK first to support the
new parameters.
pip3 install -U openai
API Parameters

Input:

max_tokens: The maximum length of the final response after the CoT output is completed, defaulting to 4K, with a maximum of 8K. Note that the CoT output can reach up to 32K tokens, and the parameter to control the CoT length (reasoning_effort) will be available soon.

Output:

reasoning_content: The content of the CoT, which is at the same level as content in the output structure. See the API Example for details.
content: The content of the final answer.

Context Length: The API supports a maximum context length of 64K, and the length of the output reasoning_content is not counted within the 64K context length.

Supported Features: Chat Completion, Chat Prefix Completion (Beta)

Not Supported Features: Function Calling, JSON Output, FIM (Beta)

Not Supported Parameters: temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs. Please note that, to ensure compatibility with existing software, setting temperature, top_p, presence_penalty, or frequency_penalty will not trigger an error but will have no effect. Setting logprobs or top_logprobs will trigger an error.

Multi-round Conversation
In each round of the conversation, the model outputs the CoT (reasoning_content)
and the final answer (content). In the next round of the conversation, the CoT from
previous rounds is not concatenated into the context, as illustrated in the
following diagram:

Please note that if the reasoning_content field is included in the sequence of input messages, the API will return a 400 error. Therefore, you should remove the reasoning_content field from the API response before making the next API request, as demonstrated in the API example.
API Example
The following code, using Python as an example, demonstrates how to access the CoT
and the final answer, as well as how to conduct multi-round conversations:

No Streaming:

from openai import OpenAI

client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

# Round 1
messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content

# Round 2
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "How many Rs are there in the word 'strawberry'?"})
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages
)
# ...

Streaming:

from openai import OpenAI

client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

# Round 1
messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages,
    stream=True
)

reasoning_content = ""
content = ""
for chunk in response:
    if chunk.choices[0].delta.reasoning_content:
        reasoning_content += chunk.choices[0].delta.reasoning_content
    else:
        content += chunk.choices[0].delta.content

# Round 2
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "How many Rs are there in the word 'strawberry'?"})
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages,
    stream=True
)
# ...


------- • -------

https://api-docs.deepseek.com/guides/multi_round_chat

Multi-round Conversation
This guide will introduce how to use the DeepSeek /chat/completions API for multi-
turn conversations.
The DeepSeek /chat/completions API is a "stateless" API, meaning the server does
not record the context of the user's requests. Therefore, the user must concatenate
all previous conversation history and pass it to the chat API with each request.
The following code in Python demonstrates how to concatenate context to achieve
multi-turn conversations.
from openai import OpenAI

client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

# Round 1
messages = [{"role": "user", "content": "What's the highest mountain in the world?"}]
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages
)

messages.append(response.choices[0].message)
print(f"Messages Round 1: {messages}")

# Round 2
messages.append({"role": "user", "content": "What is the second?"})
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages
)

messages.append(response.choices[0].message)
print(f"Messages Round 2: {messages}")

In the first round of the request, the messages passed to the API are:
[
    {"role": "user", "content": "What's the highest mountain in the world?"}
]
In the second round of the request:

Add the model's output from the first round to the end of the messages.
Add the new question to the end of the messages.

The messages ultimately passed to the API are:

[
    {"role": "user", "content": "What's the highest mountain in the world?"},
    {"role": "assistant", "content": "The highest mountain in the world is Mount Everest."},
    {"role": "user", "content": "What is the second?"}
]


------- • -------

https://api-docs.deepseek.com/guides/chat_prefix_completion

Chat Prefix Completion (Beta)
The chat prefix completion follows the Chat Completion API, where users provide an
assistant's prefix message for the model to complete the rest of the message.
Notice
When using chat prefix completion, users must ensure that the role of the last
message in the messages list is assistant and set the prefix parameter of the last
message to True.
The user needs to set base_url="https://api.deepseek.com/beta" to enable the Beta feature.

Sample Code
Below is a complete Python code example for chat prefix completion. In this
example, we set the prefix message of the assistant to "```python\n" to force the
model to output Python code, and set the stop parameter to ['```'] to prevent
additional explanations from the model.
from openai import OpenAI

client = OpenAI(
    api_key="<your api key>",
    base_url="https://api.deepseek.com/beta",
)

messages = [
    {"role": "user", "content": "Please write quick sort code"},
    {"role": "assistant", "content": "```python\n", "prefix": True}
]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    stop=["```"],
)

print(response.choices[0].message.content)


------- • -------

https://api-docs.deepseek.com/guides/fim_completion

FIM Completion (Beta)
In FIM (Fill In the Middle) completion, users can provide a prefix and a suffix (optional), and the model will complete the content in between. FIM is commonly used for content completion and code completion.
Notice

The max tokens of FIM completion is 4K.


The user needs to set base_url=https://api.deepseek.com/beta to enable the Beta feature.

Sample Code
Below is a complete Python code example for FIM completion. In this example, we
provide the beginning and the end of a function to calculate the Fibonacci
sequence, allowing the model to complete the content in the middle.
from openai import OpenAI

client = OpenAI(
    api_key="<your api key>",
    base_url="https://api.deepseek.com/beta",
)

response = client.completions.create(
    model="deepseek-chat",
    prompt="def fib(a):",
    suffix="    return fib(a-1) + fib(a-2)",
    max_tokens=128
)

print(response.choices[0].text)
Integration With Continue
Continue is a VSCode plugin that supports code completion. You can refer to this
document to configure Continue for using the code completion feature.


------- • -------

https://api-docs.deepseek.com/guides/json_mode

JSON Output
In many scenarios, users need the model to output in strict JSON format to achieve
structured output, facilitating subsequent parsing.
DeepSeek provides JSON Output to ensure the model outputs valid JSON strings.
Notice
To enable JSON Output, users should:

Set the response_format parameter to {'type': 'json_object'}.


Include the word "json" in the system or user prompt, and provide an example of the
desired JSON format to guide the model in outputting valid JSON.
Set the max_tokens parameter reasonably to prevent the JSON string from being
truncated midway.
When using the JSON Output feature, the API may occasionally return empty content.
We are actively working on optimizing this issue. You can try modifying the prompt
to mitigate such problems.

Sample Code
Here is the complete Python code demonstrating the use of JSON Output:
import json
from openai import OpenAI

client = OpenAI(
    api_key="<your api key>",
    base_url="https://api.deepseek.com",
)

system_prompt = """The user will provide some exam text. Please parse the "question" and "answer" and output them in JSON format.

EXAMPLE INPUT:
Which is the highest mountain in the world? Mount Everest.

EXAMPLE JSON OUTPUT:
{
    "question": "Which is the highest mountain in the world?",
    "answer": "Mount Everest"
}
"""

user_prompt = "Which is the longest river in the world? The Nile River."

messages = [{"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    response_format={'type': 'json_object'}
)

print(json.loads(response.choices[0].message.content))
The model will output:

{
    "question": "Which is the longest river in the world?",
    "answer": "The Nile River"
}
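Since the notice above warns that the API may occasionally return empty content, and that an undersized max_tokens can truncate the JSON mid-string, a small defensive parsing helper can guard the json.loads call. This is a sketch of my own (parse_json_output is not part of the API):

```python
import json

def parse_json_output(content):
    """Parse model output that is expected to be JSON.

    Returns the parsed object, or None if the content is empty or is not
    valid JSON (e.g. truncated because max_tokens was set too small).
    """
    if not content or not content.strip():
        return None  # occasional empty output, as noted above
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return None  # truncated or otherwise malformed JSON

# When None comes back, the caller can retry or adjust the prompt.
```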

Community: Email / Discord / Twitter / GitHub. Copyright © 2025 DeepSeek, Inc.

------- • -------
https://api-docs.deepseek.com/guides/function_calling

Function Calling
Function Calling allows the model to call external tools to enhance its
capabilities.
Notice
The current version of the deepseek-chat model's Function Calling capability is unstable, which may result in looped calls or empty responses. We are actively working on a fix, and it is expected to be resolved in the next version.
Sample Code
Here is an example of using Function Calling to get the current weather information
of the user's location, demonstrated with complete Python code.
For the specific API format of Function Calling, please refer to the Chat
Completion documentation.
from openai import OpenAI

def send_messages(messages):
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        tools=tools
    )
    return response.choices[0].message

client = OpenAI(
    api_key="<your api key>",
    base_url="https://api.deepseek.com",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather of a location; the user should supply a location first",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    }
                },
                "required": ["location"]
            },
        }
    },
]

messages = [{"role": "user", "content": "How's the weather in Hangzhou?"}]
message = send_messages(messages)
print(f"User>\t {messages[0]['content']}")

tool = message.tool_calls[0]
messages.append(message)
messages.append({"role": "tool", "tool_call_id": tool.id, "content": "24℃"})
message = send_messages(messages)
print(f"Model>\t {message.content}")
The execution flow of this example is as follows:

1. User: asks about the current weather in Hangzhou.
2. Model: returns the function call get_weather({location: 'Hangzhou'}).
3. User: calls the function get_weather({location: 'Hangzhou'}) and provides the result to the model.
4. Model: returns the answer in natural language, "The current temperature in Hangzhou is 24°C."
Note: In the above code, the functionality of the get_weather function needs to be provided by the user. The model itself does not execute specific functions.
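As the note says, executing the tool is the caller's responsibility. One common way to wire that up is a registry mapping tool names to local functions; the sketch below is my own (run_tool_calls and REGISTRY are not part of the API), built on the tool_calls structure shown in the sample above:

```python
import json

def run_tool_calls(message, registry):
    """Execute every tool call in an assistant message.

    `registry` maps tool names to local Python callables; the return value
    is a list of "tool" role messages ready to append to the conversation.
    """
    tool_messages = []
    for call in message.tool_calls:
        func = registry[call.function.name]
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": str(func(**args)),
        })
    return tool_messages

def get_weather(location):
    # Stand-in implementation; a real one would query a weather service.
    return "24℃"

REGISTRY = {"get_weather": get_weather}
```

The resulting "tool" messages are then appended to the conversation and sent back to the model, exactly as step 3 of the execution flow describes.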


------- • -------
https://api-docs.deepseek.com/guides/kv_cache

Context Caching
The DeepSeek API Context Caching on Disk Technology is enabled by default for all
users, allowing them to benefit without needing to modify their code.
Each user request will trigger the construction of a hard disk cache. If subsequent
requests have overlapping prefixes with previous requests, the overlapping part
will only be fetched from the cache, which counts as a "cache hit."
Note: Between two requests, only the repeated prefix part can trigger a "cache
hit." Please refer to the example below for more details.

Example 1: Long Text Q&A


First Request

messages: [
    {"role": "system", "content": "You are an experienced financial report analyst..."},
    {"role": "user", "content": "<financial report content>\n\nPlease summarize the key information of this financial report."}
]

Second Request

messages: [
    {"role": "system", "content": "You are an experienced financial report analyst..."},
    {"role": "user", "content": "<financial report content>\n\nPlease analyze the profitability of this financial report."}
]
In the above example, both requests have the same prefix, which is the system
message + <financial report content> in the user message. During the second
request, this prefix part will count as a "cache hit."

Example 2: Multi-round Conversation


First Request

messages: [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of China?"}
]

Second Request

messages: [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of China?"},
    {"role": "assistant", "content": "The capital of China is Beijing."},
    {"role": "user", "content": "What is the capital of the United States?"}
]
In this example, the second request can reuse the initial system message and user
message from the first request, which will count as a "cache hit."

Example 3: Using Few-shot Learning


In practical applications, users can enhance the model's output performance through
few-shot learning. Few-shot learning involves providing a few examples in the
request to allow the model to learn a specific pattern. Since few-shot generally
provides the same context prefix, the cost of few-shot is significantly reduced
with the support of context caching.
First Request

messages: [
    {"role": "system", "content": "You are a history expert. The user will provide a series of questions, and your answers should be concise and start with `Answer:`"},
    {"role": "user", "content": "In what year did Qin Shi Huang unify the six states?"},
    {"role": "assistant", "content": "Answer: 221 BC"},
    {"role": "user", "content": "Who was the founder of the Han Dynasty?"},
    {"role": "assistant", "content": "Answer: Liu Bang"},
    {"role": "user", "content": "Who was the last emperor of the Tang Dynasty?"},
    {"role": "assistant", "content": "Answer: Li Zhu"},
    {"role": "user", "content": "Who was the founding emperor of the Ming Dynasty?"},
    {"role": "assistant", "content": "Answer: Zhu Yuanzhang"},
    {"role": "user", "content": "Who was the founding emperor of the Qing Dynasty?"}
]

Second Request

messages: [
    {"role": "system", "content": "You are a history expert. The user will provide a series of questions, and your answers should be concise and start with `Answer:`"},
    {"role": "user", "content": "In what year did Qin Shi Huang unify the six states?"},
    {"role": "assistant", "content": "Answer: 221 BC"},
    {"role": "user", "content": "Who was the founder of the Han Dynasty?"},
    {"role": "assistant", "content": "Answer: Liu Bang"},
    {"role": "user", "content": "Who was the last emperor of the Tang Dynasty?"},
    {"role": "assistant", "content": "Answer: Li Zhu"},
    {"role": "user", "content": "Who was the founding emperor of the Ming Dynasty?"},
    {"role": "assistant", "content": "Answer: Zhu Yuanzhang"},
    {"role": "user", "content": "When did the Shang Dynasty fall?"}
]
In this example, 4 shots are used. The only difference between the two requests is the last question. The second request can reuse the content of the first 4 rounds of dialogue from the first request, which will count as a "cache hit."
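The shared prefix between two such requests can be eyeballed with a small helper. This is my own sketch, and only an approximation: the real cache matches token prefixes in 64-token storage units, not whole messages.

```python
def shared_message_prefix(request_a, request_b):
    """Count how many leading messages two requests share verbatim.

    Approximates the cacheable prefix; the server actually matches
    token prefixes in 64-token units, not whole messages.
    """
    count = 0
    for msg_a, msg_b in zip(request_a, request_b):
        if msg_a != msg_b:
            break
        count += 1
    return count
```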

Checking Cache Hit Status


In the response from the DeepSeek API, we have added two fields in the usage section to reflect the cache hit status of the request:

- prompt_cache_hit_tokens: the number of tokens in the input of this request that resulted in a cache hit (0.1 yuan per million tokens).
- prompt_cache_miss_tokens: the number of tokens in the input of this request that did not result in a cache hit (1 yuan per million tokens).
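Given those two usage fields, the input cost of a request follows directly. A small sketch of my own, hard-coding the per-million-token prices quoted above:

```python
def prompt_cost_yuan(usage):
    """Input cost in yuan: cache hits bill at 0.1 yuan per million
    tokens, cache misses at 1 yuan per million (prices quoted above)."""
    hit = usage.get("prompt_cache_hit_tokens", 0)
    miss = usage.get("prompt_cache_miss_tokens", 0)
    return (hit / 1_000_000) * 0.1 + (miss / 1_000_000) * 1.0
```

For example, a prompt whose 1M input tokens all hit the cache costs one tenth of the same prompt served entirely from a cache miss.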

Hard Disk Cache and Output Randomness


The hard disk cache only matches the prefix part of the user's input. The output is
still generated through computation and inference, and it is influenced by
parameters such as temperature, introducing randomness.
Additional Notes

The cache system uses 64 tokens as a storage unit; content less than 64 tokens will
not be cached.

The cache system works on a "best-effort" basis and does not guarantee a 100% cache
hit rate.

Cache construction takes seconds. Once the cache is no longer in use, it will be
automatically cleared, usually within a few hours to a few days.




------- • -------

https://api-docs.deepseek.com/faq

FAQ
Account
Cannot sign in to my account
Your recent account activity may have triggered our automated risk control
strategy, resulting in the temporary suspension of your access to the account. If
you wish to appeal, please fill out this form, and we will process it as soon as
possible.
Cannot register with my email
If you encounter an error message saying "Login failed. Your email domain is
currently not supported for registration." during registration, it is because your
email is not supported by DeepSeek. Please switch to a different email service
provider. If the issue persists, please contact [email protected].

Billing
Is there any expiration date for my balance?
Your topped-up balance will not expire. You can check the expiration date of the
granted balance on the billing page.

API Call
Are there any rate limits when calling your API? Can I increase the limits for my
account?
The rate limit applied to each account is adjusted dynamically according to our real-time traffic pressure and each account's short-term historical usage.
We temporarily do not support increasing the dynamic rate limit for any individual account; thanks for your understanding.
Why do I feel that your API's speed is slower than the web service?
The web service uses streaming output, i.e., every time the model outputs a token,
it will be displayed incrementally on the web page.
The API uses non-streaming output (stream=false) by default, i.e., the model's
output will not be returned to the user until the generation is done completely.
You can use streaming output in your API call to optimize interactivity.
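As a sketch of that suggestion, the logic for enabling streaming and assembling the incremental deltas might look like this. The helper name is my own; `client` is assumed to be an OpenAI-compatible client already configured with the DeepSeek base_url and API key:

```python
def stream_reply(client, messages, model="deepseek-chat"):
    """Stream a chat completion and return the assembled text."""
    stream = client.chat.completions.create(
        model=model, messages=messages, stream=True
    )
    parts = []
    for chunk in stream:
        # Each streamed chunk carries an incremental delta; it can be None.
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)  # a real UI would display this incrementally
    return "".join(parts)
```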
Why are empty lines continuously returned when calling the API?
To prevent the TCP connection from being interrupted due to timeout, we continuously return empty lines (for non-streaming requests) or SSE keep-alive comments (: keep-alive, for streaming requests) while waiting for the request to be scheduled. If you are parsing the HTTP response yourself, please make sure to handle these empty lines or comments appropriately.
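If you do handle the raw HTTP response yourself, a minimal filter along these lines (the function name is my own) can drop the keep-alive padding before the remaining lines are parsed:

```python
def drop_keep_alive(lines):
    """Yield only meaningful response lines, skipping blank padding lines
    and SSE comment lines (which start with ':', e.g. ': keep-alive')."""
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # empty keep-alive line (non-streaming requests)
        if stripped.startswith(":"):
            continue  # SSE comment (streaming keep-alive)
        yield line
```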
Does your API support LangChain?
Yes. You can refer to the demo code below, which demonstrates how to use LangChain
with DeepSeek API. Replace the API key in the code as necessary.
deepseek_langchain.py
How to calculate token usage offline?
Please refer to Token & Token Usage.


------- • -------

https://api-docs.deepseek.com/updates

Change Log

Version: 2025-01-20
deepseek-reasoner

deepseek-reasoner is our new model, DeepSeek-R1. You can invoke DeepSeek-R1 by specifying model='deepseek-reasoner'.

For details, please refer to: DeepSeek-R1 Release
For guides, please refer to: Reasoning Model

Version: 2024-12-26
deepseek-chat

The deepseek-chat model has been upgraded to DeepSeek-V3. The API remains
unchanged. You can invoke DeepSeek-V3 by specifying model='deepseek-chat'.
For details, please refer to: introducing DeepSeek-V3

Version: 2024-12-10
deepseek-chat
The deepseek-chat model has been upgraded to DeepSeek-V2.5-1210, with improvements
across various capabilities. Relevant benchmarking results include:

- Mathematics: performance on the MATH-500 benchmark has improved from 74.8% to 82.8%.
- Coding: accuracy on the LiveCodeBench (08.01 - 12.01) benchmark has increased from 29.2% to 34.38%.
- Writing and Reasoning: corresponding improvements have been observed in internal test datasets.

Additionally, the new version of the model has optimized the user experience for
file upload and webpage summarization functionalities.

Version: 2024-09-05
deepseek-coder & deepseek-chat Upgraded to DeepSeek V2.5 Model
The DeepSeek V2 Chat and DeepSeek Coder V2 models have been merged and upgraded
into the new model, DeepSeek V2.5.
For backward compatibility, API users can access the new model through either
deepseek-coder or deepseek-chat.
The new model significantly surpasses the previous versions in both general
capabilities and code abilities.
The new model better aligns with human preferences and has been optimized in
various areas such as writing tasks and instruction following:

- ArenaHard win rate improved from 68.3% to 76.3%
- AlpacaEval 2.0 LC win rate increased from 46.61% to 50.52%
- MT-Bench score rose from 8.84 to 9.02
- AlignBench score increased from 7.88 to 8.04

The new model has further enhanced its code generation capabilities based on the
original Coder model, optimized for common programming application scenarios, and
achieved the following results on the standard test set:

- HumanEval: 89%
- LiveCodeBench (January-September): 41%

Version: 2024-08-02
API Launches Context Caching on Disk Technology
The DeepSeek API has innovatively adopted hard disk caching, reducing prices by
another order of magnitude.
For more details on the update, please refer to the documentation Context Caching
is Available 2024/08/02.

Version: 2024-07-25
New API Features

Update API /chat/completions

JSON Mode
Function Calling
Chat Prefix Completion (Beta)
8K max_tokens (Beta)

New API /completions

FIM Completion (Beta)

For more details, please check the documentation New API Features 2024/07/25

Version: 2024-07-24
deepseek-coder
The deepseek-coder model has been upgraded to DeepSeek-Coder-V2-0724.

Version: 2024-06-28
deepseek-chat
The deepseek-chat model has been upgraded to DeepSeek-V2-0628.
Model's reasoning capabilities have improved, as shown in relevant benchmarks:

- Coding: HumanEval Pass@1 79.88% -> 84.76%
- Mathematics: MATH ACC@1 55.02% -> 71.02%
- Reasoning: BBH 78.56% -> 83.40%

In the Arena-Hard evaluation, the win rate against GPT-4-0314 increased from 41.6%
to 68.3%.
The model's role-playing capabilities have significantly enhanced, allowing it to
act as different characters as requested during conversations.

Version: 2024-06-14
deepseek-coder
The deepseek-coder model has been upgraded to DeepSeek-Coder-V2-0614, significantly
enhancing its coding capabilities. It has reached the level of GPT-4-Turbo-0409 in
code generation, code understanding, code debugging, and code completion.
Additionally, it possesses excellent mathematical and reasoning abilities, and its
general capabilities are on par with DeepSeek-V2-0517.

Version: 2024-05-17
deepseek-chat
The deepseek-chat model has been upgraded to DeepSeek-V2-0517. The model has seen a significant improvement in following instructions, with the IFEval Benchmark Prompt-Level accuracy jumping from 63.9% to 77.6%. Additionally, on the API end, we have optimized the model's ability to follow instructions given in the "system" part. This optimization has significantly elevated the user experience across a variety of tasks, including immersive translation, Retrieval-Augmented Generation (RAG), and more.
The model's accuracy in outputting JSON format has been enhanced. In our internal test set, the JSON parsing rate increased from 78% to 85%. By introducing appropriate regular expressions, the JSON parsing rate was further improved to 97%.


------- • -------

https://api-docs.deepseek.com/#invoke-the-chat-api

Your First API Call
The DeepSeek API uses an API format compatible with OpenAI. By modifying the configuration, you can use the OpenAI SDK, or software compatible with the OpenAI API, to access the DeepSeek API.

PARAM       VALUE
base_url *  https://api.deepseek.com
api_key     apply for an API key

* To be compatible with OpenAI, you can also use https://api.deepseek.com/v1 as the base_url. But note that the v1 here has NO relationship with the model's version.
* The deepseek-chat model has been upgraded to DeepSeek-V3. The API remains unchanged. You can invoke DeepSeek-V3 by specifying model='deepseek-chat'.
* deepseek-reasoner is the latest reasoning model, DeepSeek-R1, released by DeepSeek. You can invoke DeepSeek-R1 by specifying model='deepseek-reasoner'.
Invoke The Chat API
Once you have obtained an API key, you can access the DeepSeek API using the following example scripts. This is a non-streaming example; you can set the stream parameter to true to get a streaming response.

curl:

curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <DeepSeek API Key>" \
  -d '{
        "model": "deepseek-chat",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Hello!"}
        ],
        "stream": false
      }'

python:

# Please install OpenAI SDK first: `pip3 install openai`
from openai import OpenAI

client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    stream=False
)

print(response.choices[0].message.content)

nodejs:

// Please install OpenAI SDK first: `npm install openai`
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: 'https://api.deepseek.com',
  apiKey: '<DeepSeek API Key>'
});

async function main() {
  const completion = await openai.chat.completions.create({
    messages: [{ role: "system", content: "You are a helpful assistant." }],
    model: "deepseek-chat",
  });

  console.log(completion.choices[0].message.content);
}

main();


------- • -------

https://api-docs.deepseek.com/zh-cn/quick_start/pricing

Models & Pricing

The prices listed in the table below are in units of "per million tokens". A token is the smallest unit the model uses to represent natural-language text: it can be a word, a number, a punctuation mark, and so on. Billing is based on the total number of input and output tokens processed by the model.

Model & Price Details

MODEL (1)          CONTEXT  MAX COT      MAX OUTPUT  INPUT PRICE / 1M TOKENS  INPUT PRICE / 1M TOKENS  OUTPUT PRICE / 1M TOKENS
                   LENGTH   TOKENS (2)   LENGTH (3)  (CACHE HIT) (4)          (CACHE MISS)

In CNY:
deepseek-chat      64K      -            8K          0.5 yuan (5) / 0.1 yuan  2 yuan (5) / 1 yuan      8 yuan (5) / 2 yuan
deepseek-reasoner  64K      32K          8K          1 yuan                   4 yuan                   16 yuan (6)

In USD:
deepseek-chat      64K      -            8K          $0.07 (5) / $0.014       $0.27 (5) / $0.14        $1.10 (5) / $0.28
deepseek-reasoner  64K      32K          8K          $0.14                    $0.55                    $2.19 (6)

(1) The deepseek-chat model has been upgraded to DeepSeek-V3; the deepseek-reasoner model is the new model DeepSeek-R1.
(2) The chain of thought (CoT) is the reasoning content deepseek-reasoner produces before giving its final answer; see Reasoning Model for the details.
(3) If max_tokens is not specified, the default maximum output length is 4K. Please adjust max_tokens to support longer outputs.
(4) For details on context caching, please refer to DeepSeek Context Caching.
(5) The table shows the prices both before and after the discount. From now until 2025-02-08 24:00 (Beijing time), all users enjoy the discounted prices of the DeepSeek-V3 API; after that, prices return to the original level. DeepSeek-R1 is not included in the discount.
(6) The output token count of deepseek-reasoner includes all tokens of both the CoT and the final answer, which are priced identically.

Deduction Rules

The deducted fee = token consumption × model unit price, and the corresponding amount is deducted directly from your topped-up balance or granted balance. When both balances exist, the granted balance is deducted first.
Product prices are subject to change, and DeepSeek reserves the right to adjust them. Please top up according to your actual usage and check this page regularly for the latest pricing information.


------- • -------

https://api-docs.deepseek.com/zh-cn/quick_start/parameter_settings

The Temperature Parameter

The default value of the temperature parameter is 1.0.

We recommend setting temperature according to your use case, as shown in the table below:

USE CASE                        TEMPERATURE
Coding / Math                   0.0
Data Cleaning / Data Analysis   1.0
General Conversation            1.3
Translation                     1.3
Creative Writing / Poetry       1.5
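The recommendations above can be kept in a small lookup table and passed straight into the temperature parameter of the chat call. This is my own sketch; the scenario keys are made up for illustration:

```python
# Recommended temperature per use case, taken from the table above.
RECOMMENDED_TEMPERATURE = {
    "coding": 0.0,
    "math": 0.0,
    "data_analysis": 1.0,
    "general_conversation": 1.3,
    "translation": 1.3,
    "creative_writing": 1.5,
}

def temperature_for(use_case):
    """Return the recommended temperature, defaulting to 1.0."""
    return RECOMMENDED_TEMPERATURE.get(use_case, 1.0)

# The value would then be passed to the API call, e.g.:
# client.chat.completions.create(model="deepseek-chat", messages=...,
#                                temperature=temperature_for("translation"))
```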


------- • -------

https://api-docs.deepseek.com/zh-cn/quick_start/token_usage

Token & Token Usage

Tokens are the basic units the model uses to represent natural-language text, and they are also our billing units. They can be intuitively understood as "characters" or "words"; typically, one Chinese word, one English word, one number, or one symbol counts as one token.

In general, the conversion ratio between tokens and characters is roughly:

- 1 English character ≈ 0.3 tokens
- 1 Chinese character ≈ 0.6 tokens

However, because different models tokenize text differently, the ratio varies. The actual number of tokens processed in each request is returned by the model and can be found in the usage field of the response.

Calculating Token Usage Offline

You can run the tokenizer using the code in the package below to calculate the token usage of a piece of text offline.

deepseek_v3_tokenizer.zip
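The rough ratios above can be turned into a quick offline estimate. This is a heuristic sketch of my own only; the tokenizer package above, or the usage field of a real response, is authoritative:

```python
def estimate_tokens(text):
    """Rough token estimate: CJK characters ≈ 0.6 tokens each,
    everything else ≈ 0.3 tokens each (ratios from the table above)."""
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    other = len(text) - cjk
    return round(cjk * 0.6 + other * 0.3)
```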


------- • -------

https://api-docs.deepseek.com/zh-cn/quick_start/rate_limit

Rate Limit

The DeepSeek API does not limit user concurrency; we will do our best to ensure the quality of service for all of your requests.

Please note, however, that when our servers are under high traffic pressure, it may take some time after you send a request before you receive the server's response. During this period, your HTTP request stays connected and continuously receives content in the following formats:

- Non-streaming requests: empty lines are returned continuously
- Streaming requests: SSE keep-alive comments (: keep-alive) are returned continuously

These contents do not affect the OpenAI SDK's parsing of the JSON body of the response. If you are parsing the HTTP response yourself, please take care to handle these empty lines or comments.

If the request has still not completed after 30 minutes, the server will close the connection.

社区邮箱 DiscordTwitter 更多 GitHubCopyright © 2024 DeepSeek, Inc.
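When parsing streaming responses by hand rather than through the OpenAI SDK, the keep-alive lines described above must be skipped. A minimal sketch (the function name and input shape are illustrative; a real SSE stream may also carry `event:` and `id:` fields):

```python
def parse_sse_lines(lines):
    """Yield only the data payloads from an SSE line stream, skipping the
    empty lines and keep-alive comments described above."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(":"):  # blank line, or comment like ": keep-alive"
            continue
        if line.startswith("data:"):
            yield line[len("data:"):].strip()
```

Per the SSE format, any line starting with a colon is a comment and carries no data, so dropping it is always safe.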


------- • -------

https://api-docs.deepseek.com/zh-cn/quick_start/error_codes

Error Codes

When calling the DeepSeek API, you may encounter the following errors. Listed below are the causes and solutions.

400 - Invalid Format
  Cause: invalid request body format.
  Solution: modify the request body according to the hints in the error message.
401 - Authentication Fails
  Cause: wrong API key; authentication failed.
  Solution: check whether your API key is correct; if you do not have one, create an API key first.
402 - Insufficient Balance
  Cause: you have run out of balance.
  Solution: check your account balance and go to the top-up page to add funds.
422 - Invalid Parameters
  Cause: the request body contains invalid parameters.
  Solution: modify the parameters according to the hints in the error message.
429 - Rate Limit Reached
  Cause: the request rate (TPM or RPM) has reached the limit.
  Solution: pace your requests reasonably.
500 - Server Error
  Cause: internal server failure.
  Solution: retry after a brief wait; if the problem persists, contact us.
503 - Server Overloaded
  Cause: the server is overloaded due to high traffic.
  Solution: retry your request later.
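Of the codes above, 429, 500, and 503 are the ones typically worth retrying with backoff. A hedged sketch, independent of any particular HTTP client (`send` is any callable you supply that returns a status code and body; nothing here is part of the DeepSeek API itself):

```python
import time

RETRYABLE = {429, 500, 503}  # rate limited, server error, server overloaded

def call_with_retry(send, max_retries=3, base_delay=1.0):
    """Retry `send` on retryable status codes with exponential backoff.
    `send` is a zero-argument callable returning (status_code, body)."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_retries:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

400, 401, 402, and 422 are not retried: they indicate a problem with the request itself that a retry cannot fix.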

------- • -------

https://api-docs.deepseek.com/zh-cn/news/news250120

DeepSeek-R1 Release: Performance on Par with the Official OpenAI o1

Today we officially release DeepSeek-R1 and open-source its model weights.

DeepSeek-R1 is released under the MIT License, which permits users to train other models from R1 via distillation.

The DeepSeek-R1 API is live and exposes the chain-of-thought output to users; set model='deepseek-reasoner' to call it.

The DeepSeek website and App have been updated simultaneously as of today.

Performance on par with the official OpenAI o1

DeepSeek-R1 makes extensive use of reinforcement learning in the post-training stage, greatly improving the model's reasoning ability with only minimal labeled data. Its performance on tasks such as math, code, and natural-language reasoning rivals the official OpenAI o1.

We are making the full DeepSeek-R1 training methodology public, to foster open exchange and collaborative innovation in the technical community.

Paper: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf

Distilled small models surpass OpenAI o1-mini

Alongside open-sourcing the two 660B models DeepSeek-R1-Zero and DeepSeek-R1, we have distilled 6 smaller models from DeepSeek-R1's outputs and open-sourced them to the community; the 32B and 70B models match OpenAI o1-mini across multiple capabilities.

HuggingFace: https://huggingface.co/deepseek-ai

Open license and user agreement

To promote and encourage the development of the open-source community and the broader ecosystem, we have made the following licensing adjustments alongside the release and open-sourcing of R1:

The model is open-sourced uniformly under the MIT License. We had previously introduced the DeepSeek License, tailored to the characteristics of open-source large models and informed by common industry practice, but experience showed that a non-standard open-source license can actually increase developers' comprehension costs. This time, our open-source repositories (including model weights) uniformly adopt the standardized, permissive MIT License: fully open source, no restrictions on commercial use, no application required.

The product agreement explicitly permits "model distillation". To further promote the open sourcing and sharing of technology, we have decided to support users in performing model distillation. We have updated the user agreement of our online products to explicitly allow users to train other models using model outputs, for example via distillation.

App and web

Log in to the DeepSeek website or official App and turn on "Deep Think" mode to run all kinds of reasoning tasks with the latest DeepSeek-R1.

API and pricing

The DeepSeek-R1 API service is priced at 1 yuan per million input tokens (cache hit) / 4 yuan per million input tokens (cache miss), and 16 yuan per million output tokens.

For a detailed API usage guide, see the official documentation:
https://api-docs.deepseek.com/zh-cn/guides/reasoning_model
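The quoted prices fold into a small cost helper. This is our own illustrative calculator for the figures above (CNY per million tokens; prices may change, so check the Models & Pricing page before relying on it):

```python
def r1_cost_cny(hit_tokens: int, miss_tokens: int, output_tokens: int) -> float:
    """DeepSeek-R1 cost in CNY at the prices quoted above:
    1 CNY/M input (cache hit), 4 CNY/M input (cache miss), 16 CNY/M output."""
    return (hit_tokens * 1 + miss_tokens * 4 + output_tokens * 16) / 1_000_000
```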

------- • -------

https://api-docs.deepseek.com/zh-cn/news/news250115

DeepSeek APP (Released 2025/01/15)

------- • -------

https://api-docs.deepseek.com/zh-cn/news/news1226

DeepSeek-V3 Officially Released

Today, the first version of our all-new model series, DeepSeek-V3, goes live and is simultaneously open-sourced.
Log in to the official website chat.deepseek.com to chat with the latest V3 model. The API service has been updated in step; no change to the interface configuration is needed. The current version of DeepSeek-V3 does not yet support multimodal input or output.

Performance on par with leading overseas closed-source models

DeepSeek-V3 is a self-developed MoE model with 671B parameters, of which 37B are activated, pre-trained on 14.8T tokens.
Paper: https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf
DeepSeek-V3 outperforms other open-source models such as Qwen2.5-72B and Llama-3.1-405B on numerous benchmarks, and is on par with the world's top closed-source models, GPT-4o and Claude-3.5-Sonnet.

Encyclopedic knowledge: On knowledge tasks (MMLU, MMLU-Pro, GPQA, SimpleQA), DeepSeek-V3 improves markedly over its predecessor DeepSeek-V2.5, approaching the current best-performing model, Claude-3.5-Sonnet-1022.
Long text: In long-text evaluations (DROP, FRAMES, and LongBench v2), DeepSeek-V3 outperforms the other models on average.
Code: DeepSeek-V3 leads all existing non-o1 models by a wide margin on algorithmic coding (Codeforces), and approaches Claude-3.5-Sonnet-1022 on engineering-oriented coding (SWE-Bench Verified).
Math: On the American math competitions (AIME 2024, MATH) and the Chinese national high-school math league (CNMO 2024), DeepSeek-V3 far surpasses all open- and closed-source models.
Chinese: DeepSeek-V3 performs comparably to Qwen2.5-72B on the educational benchmark C-Eval and on pronoun-disambiguation evaluations, but leads on factual knowledge (C-SimpleQA).

Generation speed tripled

Through algorithmic and engineering innovations, DeepSeek-V3's generation speed has increased sharply from 20 TPS to 60 TPS, a 3x improvement over the V2.5 model, delivering a faster and smoother experience.

API pricing adjustment

With the stronger, faster DeepSeek-V3 now live, our model API pricing is adjusted to 0.5 yuan per million input tokens (cache hit) / 2 yuan per million input tokens (cache miss), and 8 yuan per million output tokens, so that we can keep providing better model service sustainably.

At the same time, we are setting a 45-day promotional pricing period for the new model: from now until February 8, 2025, DeepSeek-V3's API pricing remains the familiar 0.1 yuan per million input tokens (cache hit) / 1 yuan per million input tokens (cache miss), and 2 yuan per million output tokens. Both existing registered users and new users who register during this period enjoy these prices.

Open-source weights and local deployment

DeepSeek-V3 is trained in FP8, and the native FP8 weights are open-sourced.
Thanks to the support of the open-source community, SGLang and LMDeploy immediately added support for native FP8 inference of the V3 model, while TensorRT-LLM and MindIE implement BF16 inference. To help the community adapt and extend use cases, we also provide a conversion script from FP8 to BF16.
For model weight downloads and more local deployment information, see: https://huggingface.co/deepseek-ai/DeepSeek-V3-Base
"Pursuing inclusive AGI with an open-source spirit and long-termism" has always been DeepSeek's firm belief. We are excited to share our incremental progress in model pre-training with the community, and delighted to see the capability gap between open-source and closed-source models continuing to narrow.
This is a brand-new beginning. Going forward, we will continue building richer features such as deep thinking and multimodality on top of the DeepSeek-V3 base model, and we will keep sharing our latest findings with the community.

------- • -------

https://api-docs.deepseek.com/zh-cn/news/news1210

DeepSeek V2 Series Wraps Up; Web Search Lands on the Official Site

Today we release DeepSeek-V2.5-1210, the final fine-tuned version of DeepSeek V2.5. This will be the last update to the V2 series before we begin an all-new base model series.
Compared with previous versions, this update uses post-training to comprehensively improve the model across the board, including math, code, writing, and role-play. The new version also improves the file-upload feature and newly supports web search, making it even more capable of serving all kinds of work and everyday scenarios.

Improved general capabilities

The DeepSeek-V2.5-1210 version comprehensively improves the model's capabilities across domains through iteration in the post-training stage.

In keeping with our consistent open-source spirit, the new model weights are open-sourced on HuggingFace:
https://huggingface.co/deepseek-ai/DeepSeek-V2.5-1210

Web search

The DeepSeek-V2.5-1210 version supports web search, now live on the web UI. Log in to https://chat.deepseek.com/ and enable "Web Search" in the input box to try it. The API does not support search yet.

In "Web Search" mode, the model reads a large number of web pages in depth to generate comprehensive, accurate answers tailored to the user's needs. Facing complex questions, the model automatically extracts multiple keywords and searches them in parallel, delivering more diverse results in less time.
Below is an example of the search results:

The last version of V2.5

DeepSeek-V2.5-1210 will be the final version of the DeepSeek V2.5 model. With this release, the iteration of the DeepSeek V2 model series comes to a close.
Since its open-source release in May this year, the DeepSeek V2 series has accompanied our users for half a year and gone through 5 iterations; the support and recognition of our users have been the driving force behind our continuous updates.
"Many start well, but few finish well."
The final version is a temporary conclusion, and even more so a new starting point.
DeepSeek is building an even stronger next-generation base model, DeepSeek V3. Stay tuned!

------- • -------

https://api-docs.deepseek.com/zh-cn/news/news1120

DeepSeek's Reasoning Model Preview Goes Live, Demystifying the o1 Reasoning Process

Today, DeepSeek's newly developed reasoning model, the DeepSeek-R1-Lite preview, officially goes live.
All users can log in to the official web page (chat.deepseek.com) and start a powerful reasoning conversation with the R1-Lite preview model with one click.
The DeepSeek R1 series models are trained with reinforcement learning; their reasoning process involves extensive reflection and verification, with chains of thought reaching tens of thousands of characters.
On math, code, and various complex logical-reasoning tasks, the series achieves reasoning performance comparable to o1-preview, while showing users the complete thinking process that o1 keeps hidden.

Comprehensively improved reasoning performance

The DeepSeek-R1-Lite preview model achieved outstanding results on authoritative evaluations such as AIME, the hardest tier of the American math competitions (AMC), and the top-level global programming contest Codeforces, far surpassing well-known models such as GPT-4o.
The table below shows DeepSeek-R1-Lite's scores on the relevant evaluations:

The impact and potential of deep thinking

DeepSeek-R1-Lite's reasoning process is long and involves a great deal of reflection and verification. The figure below shows that the model's score on math competitions is closely tied to the thinking length allowed during testing.

The solid red line shows that the accuracy the model can reach is positively correlated with the allotted reasoning length;
Compared with traditional multi-sample majority voting, increasing the chain-of-thought length proves to be more efficient.

Live for everyone to try

Log in to chat.deepseek.com and select "Deep Think" mode in the input box to start a conversation with the DeepSeek-R1-Lite preview.
"Deep Think" mode is designed specifically for complex logical-reasoning problems such as math and code. Compared with ordinary simple questions, it provides more thorough, clear, and rigorously reasoned answers, fully showcasing the advantages of longer chains of thought.

Example of starting a conversation:

Use cases and effect examples:

A new beginning; stay tuned

DeepSeek-R1-Lite is still under iterative development; it only supports web use and does not yet support API calls. DeepSeek-R1-Lite is also built on a relatively small base model, which cannot fully unleash the potential of long chains of thought.
We are continuously iterating on our reasoning model series. The official DeepSeek-R1 model will be fully open-sourced; we will publish a technical report and deploy an API service.

------- • -------

https://api-docs.deepseek.com/zh-cn/news/news0905

DeepSeek-V2.5: A New Open-Source Model Merging General and Coding Capabilities

Today we have completed the merge of the DeepSeek-V2-Chat and DeepSeek-Coder-V2 models and officially release DeepSeek-V2.5.
DeepSeek-V2.5 not only retains the general conversational ability of the original Chat model and the powerful code-handling ability of the Coder model, but also aligns better with human preferences. In addition, DeepSeek-V2.5 achieves substantial improvements in writing tasks, instruction following, and more.
DeepSeek-V2.5 is now fully live on the web UI and the API. The API is backward compatible: the new model can be accessed through either deepseek-coder or deepseek-chat. Meanwhile, features such as Function Calling, FIM completion, and JSON Output remain unchanged.
The all-in-one DeepSeek-V2.5 will deliver a simpler, smarter, and more efficient experience.

Upgrade history

DeepSeek has always focused on improving and optimizing its models. In June, we made a major upgrade to DeepSeek-V2-Chat, replacing the original Chat base model with the Coder V2 base model, which significantly improved its code generation and reasoning, and released the DeepSeek-V2-Chat-0628 version. Shortly after, DeepSeek-Coder-V2 was alignment-tuned on top of the original base model, greatly improving its general abilities, and released as DeepSeek-Coder-V2 0724. Finally, we successfully merged the Chat and Coder models into the all-new DeepSeek-V2.5.

Because the model changes substantially in this version, if you see degraded results in certain scenarios, we recommend re-tuning your System Prompt and Temperature for best performance.

General capabilities

General-capability evaluation

First, we evaluated DeepSeek-V2.5 on standard industry benchmarks. On four Chinese and English test sets, DeepSeek-V2.5 outperforms both the earlier DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724.
In our internal Chinese evaluation, its win rates against GPT-4o mini and ChatGPT-4o-latest (with GPT-4o as judge) improved markedly over DeepSeek-V2-0628. This evaluation covers general abilities such as creative writing and Q&A, translating into a better user experience:

Safety evaluation

The trade-off between safety and helpfulness has been a key focus throughout our iterative development. In DeepSeek-V2.5 we drew a clearer boundary for model safety, strengthening robustness against various jailbreak attacks while reducing the tendency for safety policies to over-generalize to normal queries.

Model              Overall safety score (higher is better)*    Safety spillover rate (lower is better)**
DeepSeek-V2-0628   74.4%                                       11.3%
DeepSeek-V2.5      82.6%                                       4.6%

* Score on an internal test set; higher indicates better overall model safety.
** Rate on an internal test set; lower indicates less impact of the safety policy on normal queries.

Code capabilities

On code, DeepSeek-V2.5 retains the strong coding ability of DeepSeek-Coder-V2-0724. On HumanEval Python and LiveCodeBench (January 2024 - September 2024), DeepSeek-V2.5 shows notable improvements. On HumanEval Multilingual and Aider, DeepSeek-Coder-V2-0724 has a slight edge. On SWE-verified, both versions score low, indicating that further optimization is needed there. In addition, on the FIM completion task, the score on the internal evaluation set DS-FIM-Eval improved by 5.1%, which translates into a better plugin completion experience.
DeepSeek-V2.5 has also been optimized for common coding scenarios to improve real-world performance. In our internal subjective evaluation DS-Arena-Code, DeepSeek-V2.5's win rate against competitors (judged by GPT-4o) improved significantly.

Model open-sourcing

As always, in the spirit of enduring open source, DeepSeek-V2.5 is now available on HuggingFace:
https://huggingface.co/deepseek-ai/DeepSeek-V2.5

------- • -------

https://api-docs.deepseek.com/zh-cn/news/news0802

DeepSeek API Introduces Context Caching on Disk, Cutting Prices by Another Order of Magnitude

In typical large-model API usage, a substantial fraction of user input is repetitive. For example, prompts often contain repeatedly quoted passages; and in multi-round conversations, each round re-submits the content of all previous rounds.
DeepSeek has therefore enabled context caching on disk, storing content that is expected to be reused in a distributed array of hard disks. When input repeats, the repeated part only needs to be read from the cache and requires no computation. This technology not only reduces service latency but also drastically cuts the final usage cost.
For the cache-hit portion, DeepSeek charges 0.1 yuan per million tokens, reducing the price of large models by another order of magnitude (note 1).

Note 1: DeepSeek-V3 API prices have since been adjusted; see the Models & Pricing page for the latest prices.

How to use the DeepSeek API cache service

The disk cache service is now fully live. No code change and no interface change are required: the cache runs automatically, and the system bills according to actual cache hits.
Note that two requests only count as repeating when their prefixes are identical (identical from token 0). Repetition that starts in the middle of the input cannot be served from the cache.
Below are two classic caching scenarios:

1. Multi-round conversation: the next round hits the context cache built by the previous round.

2. Data analysis: subsequent requests with the same prefix hit the context cache.

Many kinds of applications can benefit from the context disk cache:

Q&A assistants with long preset prompts
Role-play applications with long character settings and multi-round conversations
Data-analysis applications that repeatedly query a fixed collection of documents
Repository-level code analysis and debugging tools
Improving model output via few-shot prompting
...

For more detailed usage, see the Context Caching guide.

How to check cache hits

Two fields have been added to the usage section of the API response to help users monitor cache hits in real time:

prompt_cache_hit_tokens: the number of input tokens in this request that hit the cache (0.1 yuan / million tokens)
prompt_cache_miss_tokens: the number of input tokens in this request that missed the cache (1 yuan / million tokens)

Reduced latency

For requests with long, highly repetitive input, the first-token latency of the API service drops dramatically.
As an extreme example, for a 128K-input request that is mostly repeated, the measured first-token latency dropped from 13 seconds to 500 milliseconds.

Reduced overall cost

You can save up to 90% on cost (with optimization targeting the cache characteristics).
Even without any optimization, based on historical usage patterns, users save more than 50% overall.
The cache incurs no extra fees beyond the 0.1 yuan per million tokens; cache storage itself is free.

Cache security

The cache system was designed with all potential security issues fully in mind.
Each user's cache is independent and logically invisible to others, ensuring data security and privacy at the lowest level.
Caches unused for a long time are automatically cleared; they are not retained long-term and are not repurposed.

Why the DeepSeek API can be the first to adopt disk caching

According to public information, DeepSeek may be the first large-model vendor worldwide to use disk caching extensively in its API service.
This is enabled by the MLA architecture proposed in DeepSeek V2, which improves model quality while greatly compressing the size of the context KV cache, sharply reducing both the transfer bandwidth and the storage capacity required, and thus making it feasible to cache on low-cost hard disks.

Concurrency and rate limits of the DeepSeek API

The DeepSeek API service is designed for a capacity of 1 trillion tokens per day. There is no rate limit and no concurrency limit for any user, while service quality is guaranteed. Feel free to scale up your concurrency.

Note 1: The cache uses 64 tokens as a storage unit; content shorter than 64 tokens will not be cached.
Note 2: The cache is best-effort and does not guarantee a 100% hit rate.
Note 3: Unused caches are cleared automatically, usually within a few hours to a few days.
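The two usage fields above translate directly into the effective input cost. A sketch using the prices quoted in this announcement (the helper name is ours; note 1 above says DeepSeek-V3 prices were later adjusted, so treat the constants as illustrative):

```python
def input_cost_cny(usage: dict) -> float:
    """Effective input cost in CNY from the usage fields described above:
    0.1 CNY per million cache-hit tokens, 1 CNY per million cache-miss tokens."""
    hit = usage["prompt_cache_hit_tokens"]
    miss = usage["prompt_cache_miss_tokens"]
    return (hit * 0.1 + miss * 1.0) / 1_000_000
```

A fully cached prompt therefore costs one tenth of an uncached one, which is where the "up to 90% savings" figure comes from.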

------- • -------

https://api-docs.deepseek.com/zh-cn/news/news0725

DeepSeek API Upgrade: Supports Continuation, FIM, Function Calling, and JSON Output

Today the DeepSeek API gets an upgrade, equipped with new interface features that unlock more of the model's potential:

Updated endpoint /chat/completions

JSON Output
Function Calling
Chat Prefix Completion (Beta)
8K max output (Beta)

New endpoint /completions

FIM Completion (Beta)

All new features can be used with the deepseek-chat and deepseek-coder models.

I. Updated endpoint /chat/completions

1. JSON Output: structured content

The DeepSeek API adds a JSON Output feature, compatible with the OpenAI API, which forces the model to output a JSON-formatted string.
For data-processing tasks and the like, this feature lets the model return JSON in a predetermined format, making it easy to parse the model's output downstream and improving the automation of program workflows.
To use JSON Output, you need to:

Set the response_format parameter to {'type': 'json_object'}
Instruct the model in the prompt to output JSON in the desired format, to ensure the output matches expectations
Set max_tokens reasonably, to prevent the JSON string from being truncated mid-way

Below is a usage example of JSON Output. In this example, the user provides a piece of text, and the model formats the questions and answers found in the text as JSON.

For detailed usage, see the JSON Output guide.

2. Function Calling: connecting to the physical world

The DeepSeek API adds Function Calling, compatible with the OpenAI API, which enhances the model's ability to interact with the physical world by invoking external tools.
Function Calling supports passing in multiple functions (up to 128) and supports parallel function calls.
The figure below shows the effect of integrating deepseek-coder into the open-source LLM front end LobeChat. In this example, we enabled the "website crawler" plugin to crawl and summarize websites.

The figure below shows the interaction flow when using Function Calling:

For detailed usage, see the Function Calling guide.

3. Chat Prefix Completion (Beta): more flexible output control

Chat prefix completion follows the chat-completion API format and lets the user specify a prefix for the final assistant message, so the model completes from that prefix. It is also useful when output is truncated at max_tokens: concatenate the truncated message and re-send the request to continue the truncated content.
To use chat prefix completion, you need to:

Set base_url to https://api.deepseek.com/beta to enable Beta features
Make sure the role of the last message in the messages list is assistant, and set that message's prefix parameter to True, e.g.: {"role": "assistant", "content": "Once upon a time, ", "prefix": True}

Below is a usage example of chat prefix completion. In this example, the assistant message is set to begin with '```python\n' to force the model to start with a code block, and the stop parameter is set to '```' so the model does not output extra content.
For detailed usage, see the Chat Prefix Completion guide.

4. 8K max output (Beta): unlocking longer outputs

To serve scenarios requiring longer text output, the Beta API raises the upper limit of the max_tokens parameter to 8K.
To raise the max output to 8K, you need to:

Set base_url to https://api.deepseek.com/beta to enable Beta features
max_tokens defaults to 4096; with Beta enabled, max_tokens can be set up to 8192

II. New endpoint /completions

1. FIM Completion (Beta): enabling continuation scenarios

The DeepSeek API adds a FIM (Fill-In-the-Middle) completion endpoint, compatible with OpenAI's FIM completion API, which lets users provide a custom prefix and an optional suffix for the model to fill in the middle. It is commonly used for story continuation, code completion, and similar scenarios. The FIM completion endpoint is billed the same as chat completion.
To use the FIM completion endpoint, set base_url to https://api.deepseek.com/beta to enable Beta features.
Below is a usage example of the FIM completion endpoint. In this example, the user provides the beginning and the end of a Fibonacci function, and the model fills in the middle.

For detailed usage, see the FIM Completion guide.

Release notes

The Beta endpoint is open to all users; set base_url to https://api.deepseek.com/beta to enable Beta features.
Beta endpoints are unstable; subsequent testing and release plans may change flexibly. Thanks for your understanding.
The corresponding model versions will be released to the open-source community once the features stabilize. Stay tuned.

------- • -------

https://api-docs.deepseek.com/zh-cn/api/deepseek-api

DeepSeek API — Basic Information  Version: 1.0.0

Before using the DeepSeek API, please create an API key first.

Authentication
HTTP: Bearer Auth
Security Scheme Type: http
HTTP Authorization Scheme: bearer

Contact
DeepSeek support: [email protected]
Terms of Service: https://platform.deepseek.com/downloads/DeepSeek 开放平台用户协议.html
License: MIT
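The Bearer scheme above means every request simply carries an `Authorization: Bearer <key>` header. A minimal sketch (the helper name is ours; when you use the OpenAI SDK with `api_key`, it sets this header for you):

```python
def auth_headers(api_key: str) -> dict:
    """Request headers for the HTTP Bearer auth scheme described above."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```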

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/reasoning_model

Reasoning Model (deepseek-reasoner)

deepseek-reasoner is DeepSeek's reasoning model. Before emitting the final answer, the model first outputs a chain of thought (CoT) to improve the accuracy of the final answer. Our API exposes the deepseek-reasoner CoT content for users to view, display, and distill from.
When using deepseek-reasoner, please upgrade the OpenAI SDK first to support the new parameters.

    pip3 install -U openai

API parameters

Input:

max_tokens: maximum length of the final answer (excluding the CoT output), default 4K, maximum 8K. Note that the CoT output can reach up to 32K tokens; a parameter to control the CoT length (reasoning_effort) will be available soon.

Output fields:

reasoning_content: the chain-of-thought content, at the same level as content; see the example below for how to access it.
content: the final answer.

Context length: the API supports a maximum context of 64K; the length of the reasoning_content output does not count toward the 64K context length.

Supported features: chat completion, chat prefix completion (Beta)

Unsupported features: Function Calling, JSON Output, FIM completion (Beta)

Unsupported parameters: temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs. Note that for compatibility with existing software, setting temperature, top_p, presence_penalty, or frequency_penalty does not raise an error but has no effect, while setting logprobs or top_logprobs raises an error.

Context concatenation

In each round of a conversation, the model outputs the CoT (reasoning_content) and the final answer (content). In the next round, the CoT from previous rounds is not concatenated into the context, as shown in the figure below:

Note that if reasoning_content is included in the input messages sequence, the API returns a 400 error. Therefore, strip the reasoning_content field from the API response before sending the next API request, as shown in the example below.

Example

The Python code below shows how to access the chain of thought and the final answer, and how to concatenate context across rounds of a multi-round conversation.

Non-streaming:

    from openai import OpenAI
    client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

    # Round 1
    messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
    response = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=messages
    )

    reasoning_content = response.choices[0].message.reasoning_content
    content = response.choices[0].message.content

    # Round 2
    messages.append({'role': 'assistant', 'content': content})
    messages.append({'role': 'user', 'content': "How many Rs are there in the word 'strawberry'?"})
    response = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=messages
    )
    # ...

Streaming:

    from openai import OpenAI
    client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

    # Round 1
    messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
    response = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=messages,
        stream=True
    )

    reasoning_content = ""
    content = ""

    for chunk in response:
        if chunk.choices[0].delta.reasoning_content:
            reasoning_content += chunk.choices[0].delta.reasoning_content
        else:
            content += chunk.choices[0].delta.content

    # Round 2
    messages.append({"role": "assistant", "content": content})
    messages.append({'role': 'user', 'content': "How many Rs are there in the word 'strawberry'?"})
    response = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=messages,
        stream=True
    )
    # ...
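Since passing reasoning_content back to the API returns a 400 error, it is convenient to normalize assistant messages before re-sending them. A small sketch (the helper name is ours), operating on plain message dicts:

```python
def strip_reasoning(message: dict) -> dict:
    """Keep only role and content; deepseek-reasoner rejects inputs that
    carry reasoning_content with a 400 error, as noted in the guide above."""
    return {"role": message["role"], "content": message["content"]}
```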

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/multi_round_chat

Multi-round Conversation

This guide describes how to conduct multi-round conversations with the DeepSeek /chat/completions API.
The DeepSeek /chat/completions API is stateless: the server does not record the context of user requests. On each request, you must concatenate all previous conversation history yourself and pass it to the chat API.
The Python code below shows how to concatenate context to achieve a multi-round conversation.

    from openai import OpenAI
    client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

    # Round 1
    messages = [{"role": "user", "content": "What's the highest mountain in the world?"}]
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages
    )

    messages.append(response.choices[0].message)
    print(f"Messages Round 1: {messages}")

    # Round 2
    messages.append({"role": "user", "content": "What is the second?"})
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages
    )

    messages.append(response.choices[0].message)
    print(f"Messages Round 2: {messages}")

In the first round, the messages passed to the API are:

    [
        {"role": "user", "content": "What's the highest mountain in the world?"}
    ]

In the second round:

append the model's output from the first round to the end of messages;
append the new question to the end of messages.

The messages finally passed to the API are:

    [
        {"role": "user", "content": "What's the highest mountain in the world?"},
        {"role": "assistant", "content": "The highest mountain in the world is Mount Everest."},
        {"role": "user", "content": "What is the second?"}
    ]
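Because the API is stateless, the append steps above are easy to wrap in a small history holder. A sketch (the class is ours, not part of the SDK); `complete` stands in for the actual SDK call and is any callable mapping a messages list to a reply string:

```python
class Conversation:
    """Client-side history management for the stateless /chat/completions API."""

    def __init__(self):
        self.messages = []

    def ask(self, question: str, complete) -> str:
        # Append the new user turn, call the model, then record its reply
        # so the next request carries the full history.
        self.messages.append({"role": "user", "content": question})
        reply = complete(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```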


------- • -------

https://api-docs.deepseek.com/zh-cn/guides/chat_prefix_completion

Chat Prefix Completion (Beta)

Chat prefix completion follows the Chat Completion API: the user supplies a message beginning with the assistant role, and the model completes the rest of the message.

Notes

When using chat prefix completion, make sure the role of the last message in the messages list is assistant, and set that message's prefix parameter to True.
Set base_url="https://api.deepseek.com/beta" to enable Beta features.

Sample code

Below is a complete Python example of chat prefix completion. In this example, the assistant message is set to begin with "```python\n" to force the model to output Python code, and the stop parameter is set to ['```'] to avoid extra explanation from the model.

    from openai import OpenAI

    client = OpenAI(
        api_key="<your api key>",
        base_url="https://api.deepseek.com/beta",
    )

    messages = [
        {"role": "user", "content": "Please write quick sort code"},
        {"role": "assistant", "content": "```python\n", "prefix": True}
    ]
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        stop=["```"],
    )
    print(response.choices[0].message.content)

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/fim_completion

FIM Completion (Beta)

In FIM (Fill In the Middle) completion, the user provides a prefix and an optional suffix, and the model fills in the content in between. FIM is commonly used for content continuation, code completion, and similar scenarios.

Notes

The maximum completion length is 4K tokens.
Set base_url="https://api.deepseek.com/beta" to enable Beta features.

Sample code

Below is a complete Python example of FIM completion. In this example, we provide the beginning and the end of a Fibonacci function and let the model fill in the middle.

    from openai import OpenAI

    client = OpenAI(
        api_key="<your api key>",
        base_url="https://api.deepseek.com/beta",
    )

    response = client.completions.create(
        model="deepseek-chat",
        prompt="def fib(a):",
        suffix="    return fib(a-1) + fib(a-2)",
        max_tokens=128
    )
    print(response.choices[0].text)

Configuring the Continue code-completion plugin

Continue is a VSCode plugin that supports code completion. You can refer to this document to configure Continue to use the code-completion feature.
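For reference, here is one plausible way the middle could be filled in, given the prefix and suffix in the sample above. This is purely illustrative, not actual model output:

```python
def fib(a):
    # A plausible middle piece between the given prefix "def fib(a):"
    # and suffix "    return fib(a-1) + fib(a-2)": the recursion base case.
    if a < 2:
        return a
    return fib(a-1) + fib(a-2)
```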

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/json_mode

JSON Output

In many scenarios, users need the model to output strictly in JSON format, producing structured output that downstream logic can parse.
DeepSeek provides a JSON Output feature to ensure the model outputs a valid JSON string.

Notes

Set the response_format parameter to {'type': 'json_object'}.
The system or user prompt must contain the word "json", and should include an example of the desired JSON format, to guide the model to output valid JSON.
Set the max_tokens parameter reasonably to prevent the JSON string from being truncated mid-way.
When using JSON Output, the API may occasionally return empty content. We are actively optimizing this issue; you can try adjusting the prompt to mitigate it.

Sample code

Here is complete Python code using the JSON Output feature:

    import json
    from openai import OpenAI

    client = OpenAI(
        api_key="<your api key>",
        base_url="https://api.deepseek.com",
    )

    system_prompt = """The user will provide some exam text. Please parse the "question" and "answer" and output them in JSON format.

    EXAMPLE INPUT:
    Which is the highest mountain in the world? Mount Everest.

    EXAMPLE JSON OUTPUT:
    {
        "question": "Which is the highest mountain in the world?",
        "answer": "Mount Everest"
    }
    """

    user_prompt = "Which is the longest river in the world? The Nile River."

    messages = [{"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt}]

    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        response_format={'type': 'json_object'}
    )

    print(json.loads(response.choices[0].message.content))

The model will output:

    {
        "question": "Which is the longest river in the world?",
        "answer": "The Nile River"
    }
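The notes above mention two failure modes: occasional empty content, and truncation when max_tokens is too low. A defensive parsing sketch (the helper name and error messages are ours, not part of the API):

```python
import json

def parse_json_output(content):
    """Parse JSON Output defensively, surfacing the two failure modes
    described above: empty content, and truncated/invalid JSON."""
    if not content or not content.strip():
        raise ValueError("empty content returned; try adjusting the prompt")
    try:
        return json.loads(content)
    except json.JSONDecodeError as e:
        raise ValueError(f"invalid or truncated JSON (raise max_tokens?): {e}")
```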

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/function_calling

Function Calling

Function Calling lets the model call external tools to enhance its own capabilities.

Tip
In the current version, the Function Calling capability of the deepseek-chat model is unstable, which may result in looped calls or empty responses. We are actively working on a fix, expected in the next model version.

Sample code

Here is complete Python code using Function Calling, taking the weather at the user's current location as an example.
For the concrete API format of Function Calling, see the Chat Completion documentation.

    from openai import OpenAI

    def send_messages(messages):
        response = client.chat.completions.create(
            model="deepseek-chat",
            messages=messages,
            tools=tools
        )
        return response.choices[0].message

    client = OpenAI(
        api_key="<your api key>",
        base_url="https://api.deepseek.com",
    )

    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get weather of a location, the user should supply a location first",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        }
                    },
                    "required": ["location"]
                },
            }
        },
    ]

    messages = [{"role": "user", "content": "How's the weather in Hangzhou?"}]
    message = send_messages(messages)
    print(f"User>\t {messages[0]['content']}")

    tool = message.tool_calls[0]
    messages.append(message)

    messages.append({"role": "tool", "tool_call_id": tool.id, "content": "24℃"})
    message = send_messages(messages)
    print(f"Model>\t {message.content}")

The execution flow of this example:

1. User: asks about the current weather
2. Model: returns the function call get_weather({location: 'Hangzhou'})
3. User: invokes the function get_weather({location: 'Hangzhou'}) and passes the result to the model
4. Model: returns natural language: "The current temperature in Hangzhou is 24°C."

Note: the get_weather function in the code above must be provided by the user; the model itself does not execute concrete functions.
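On the application side, step 3 of the flow above amounts to dispatching the model's tool call to a local implementation. A sketch; `get_weather` here is a stub standing in for the user-provided function, and `arguments` arrives as a JSON string in the OpenAI-compatible format used above:

```python
import json

def get_weather(location: str) -> str:
    # Stub standing in for the user-provided function; a real app
    # would query an actual weather service here.
    return f"24℃ in {location}"

TOOL_IMPLS = {"get_weather": get_weather}

def run_tool_call(name: str, arguments: str) -> str:
    """Dispatch one tool call: decode the JSON argument string and
    invoke the matching local implementation."""
    args = json.loads(arguments)
    return TOOL_IMPLS[name](**args)
```

The returned string is what you would place in the `content` of the follow-up `{"role": "tool", "tool_call_id": ...}` message.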

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/kv_cache

跳到主要内容 DeepSeek API 文档中文(中国)English 中文(中国)DeepSeek Platform 快速开始首次调用 API 模型 & 价格


Temperature 设置 Token 用量计算限速错误码新闻 DeepSeek-R1 发布 2025/01/20DeepSeek APP 发布
2025/01/15DeepSeek-V3 发布 2024/12/26DeepSeek-V2.5-1210 发布 2024/12/10DeepSeek-R1-Lite
发布 2024/11/20DeepSeek-V2.5 发布 2024/09/05API 上线硬盘缓存 2024/08/02API 升级新功能
2024/07/25API 文档 API 指南推理模型 (deepseek-reasoner)多轮对话对话前缀续写(Beta)FIM 补全(Beta)JSON
OutputFunction Calling 上下文硬盘缓存提示库其它资源实用集成 API 服务状态常见问题更新日志 API 指南上下文硬盘缓存本页总览上下文硬盘缓存
DeepSeek API 上下文硬盘缓存技术对所有用户默认开启,用户无需修改代码即可享用。
用户的每一个请求都会触发硬盘缓存的构建。若后续请求与之前的请求在前缀上存在重复,则重复部分只需要从缓存中拉取,计入“缓存命中”。
注意:两个请求间,只有重复的前缀部分才能触发“缓存命中”,详间下面的例子。

例一:长文本问答
第一次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请总结一下这份财报的关键信息。"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请分析一下这份财报的盈利情况。"}]
在上例中,两次请求都有相同的前缀,即 system 消息 + user 消息中的 <财报内容>。在第二次请求时,这部分前缀会计入“缓存命中”。

Example 2: Multi-round Conversation
First request
messages: [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of China?"}
]
Second request
messages: [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of China?"},
    {"role": "assistant", "content": "The capital of China is Beijing."},
    {"role": "user", "content": "What is the capital of the United States?"}
]
In this example, the second request can reuse the system message and the first user message from the first request; that portion is counted as a cache hit.

Example 3: Few-shot Learning
In practice, users can improve model output with few-shot learning: providing a handful of examples in the request so the model learns a specific pattern. Because few-shot prompting generally reuses the same context prefix, its cost drops significantly with disk caching.
First request
messages: [
    {"role": "system", "content": "You are a history expert. The user will provide a series of questions. Your answers should be concise and start with `Answer:`"},
    {"role": "user", "content": "In which year did Qin Shi Huang unify the six states?"},
    {"role": "assistant", "content": "Answer: 221 BC"},
    {"role": "user", "content": "Who founded the Han dynasty?"},
    {"role": "assistant", "content": "Answer: Liu Bang"},
    {"role": "user", "content": "Who was the last emperor of the Tang dynasty?"},
    {"role": "assistant", "content": "Answer: Li Zhu"},
    {"role": "user", "content": "Who was the founding emperor of the Ming dynasty?"},
    {"role": "assistant", "content": "Answer: Zhu Yuanzhang"},
    {"role": "user", "content": "Who was the founding emperor of the Qing dynasty?"}
]
Second request
messages: [
    {"role": "system", "content": "You are a history expert. The user will provide a series of questions. Your answers should be concise and start with `Answer:`"},
    {"role": "user", "content": "In which year did Qin Shi Huang unify the six states?"},
    {"role": "assistant", "content": "Answer: 221 BC"},
    {"role": "user", "content": "Who founded the Han dynasty?"},
    {"role": "assistant", "content": "Answer: Liu Bang"},
    {"role": "user", "content": "Who was the last emperor of the Tang dynasty?"},
    {"role": "assistant", "content": "Answer: Li Zhu"},
    {"role": "user", "content": "Who was the founding emperor of the Ming dynasty?"},
    {"role": "assistant", "content": "Answer: Zhu Yuanzhang"},
    {"role": "user", "content": "When did the Shang dynasty fall?"}
]
In this example a 4-shot prompt is used. The two requests differ only in the final question; the second request can reuse the first 4 rounds of dialogue from the first request, and that portion is counted as a cache hit.
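To maximize cache hits, keep the few-shot examples as an unchanged prefix and append only the new question. A minimal sketch (message contents abbreviated; the helper name is our own):

```python
# Shared few-shot prefix: identical across requests, so after the first
# request it is served from the disk cache and counted as a cache hit.
FEW_SHOT_PREFIX = [
    {"role": "system", "content": "You are a history expert..."},
    {"role": "user", "content": "In which year did Qin Shi Huang unify the six states?"},
    {"role": "assistant", "content": "Answer: 221 BC"},
    # ... remaining shots ...
]

def build_messages(question: str) -> list:
    # Only the final user message varies between requests.
    return FEW_SHOT_PREFIX + [{"role": "user", "content": question}]
```

Because `build_messages` never mutates the shared prefix, every request presents a byte-identical prefix to the cache.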

Checking Cache Hit Status
The usage section of DeepSeek API responses includes two fields that report the request's cache performance:

prompt_cache_hit_tokens: the number of input tokens in this request that hit the cache (0.1 CNY per million tokens)

prompt_cache_miss_tokens: the number of input tokens in this request that missed the cache (1 CNY per million tokens)
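With these two token counts and the prices above, the input cost of a request can be computed directly. An illustrative calculation (CNY prices from this page; actual billing follows the returned usage values):

```python
def input_cost_cny(cache_hit_tokens: int, cache_miss_tokens: int) -> float:
    # 0.1 CNY per million cached input tokens, 1 CNY per million uncached ones.
    return cache_hit_tokens / 1_000_000 * 0.1 + cache_miss_tokens / 1_000_000 * 1.0
```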

Disk Cache and Output Randomness
The disk cache matches only the prefix of the user's input. The output is still produced by inference and is still affected by parameters such as temperature, which introduces randomness. The output quality is the same as without disk caching.
Other Notes

The cache uses 64-token storage units; content shorter than 64 tokens is not cached.

The cache system works on a best-effort basis and does not guarantee a 100% hit rate.

Cache construction takes on the order of seconds. A cache that is no longer in use is cleared automatically, typically after a few hours to a few days.

------- • -------

https://api-docs.deepseek.com/zh-cn/prompt-library/

Prompt Library
Explore DeepSeek prompt examples and discover more possibilities.

------- • -------

https://api-docs.deepseek.com/zh-cn/faq

FAQ

Account Issues
Unable to log in
Your account's recent activity may have triggered our automated risk-control policies, so we have temporarily suspended access to your account. To appeal, please fill out the questionnaire and we will process it as soon as possible.
Unable to register with an email address
If you receive the error "Registration failed; registration with this email domain is currently not supported" when signing up, your email provider is not supported by DeepSeek; please switch to another email service provider. If the problem persists, please contact [email protected]

Enterprise Verification
What is the difference between personal and enterprise real-name verification?
There is currently no difference in user rights or product features between personally verified and enterprise-verified accounts, but the verification methods and required materials differ. Per compliance requirements, please verify according to how your account is actually used.
Can an enterprise-verified account be changed to a personal account?
An enterprise-verified account cannot be changed to personal verification or transferred to another enterprise.

Billing
How do I top up?

Online top-up: after completing real-name verification, you can top up online with Alipay or WeChat Pay on the top-up page. You can check the result on the billing page.

Corporate bank transfer: available to enterprise users only. After completing enterprise real-name verification, you will be given a dedicated remittance account to transfer funds to. To ensure the transfer goes through, make sure the remitter's account name matches the real-name verification on the open platform.
Once the funds reach our bank account, the amount will be credited to your platform account automatically within roughly 10 minutes to 1 hour; if it does not arrive in time, please contact us.

Does the balance expire?
Topped-up balance never expires. The expiration date of granted balance can be checked on the billing page.
How do I request an invoice?
Visit the billing page, click Invoice Management, and submit an application. For enterprise users, the invoice title must match the real-name verification information. Invoices are currently issued within about 7 business days.
API Usage
What is the concurrency limit when calling models? Can my account's concurrency cap be raised?
At this stage we do not impose a hard per-user concurrency limit. When overall system load is high, a dynamic throttling model based on system load and each user's recent usage may cause 503 or 429 error codes.
Raising the concurrency cap for individual accounts is not currently supported; thank you for your understanding.
Why does the API feel slower than the web UI?
The web UI uses streaming output by default (stream=true): every token the model produces is displayed incrementally on the front end.
The API defaults to non-streaming output (stream=false): the response is returned only after the model has finished generating. You can enable the API's stream mode to improve interactivity.
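With stream=true, the OpenAI SDK yields chunks whose choices[0].delta.content carries each increment. A small helper to reassemble a streamed reply (our own sketch, not part of the SDK), shown here with stand-in chunk objects so it runs without a network call:

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    # Concatenate the incremental content deltas of a streamed response.
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. role-only or final ones) carry no content
            parts.append(delta)
    return "".join(parts)

def _chunk(text):
    # Stand-in mimicking the shape of the SDK's streaming chunk objects.
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

reply = collect_stream([_chunk("Hel"), _chunk("lo"), _chunk(None)])
# reply == "Hello"
```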
Why does the API keep returning empty lines?
To keep the TCP connection from being closed by a timeout, while a request is waiting to be scheduled we continuously return empty lines (for non-streaming requests) or SSE keep-alive comments (: keep-alive, for streaming requests). If you parse the HTTP response yourself, make sure to handle these empty lines or comments.
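If you do parse the response yourself, a filter like the following (illustrative only) can skip both kinds of keep-alive content before JSON or SSE parsing:

```python
def is_keep_alive(line: str) -> bool:
    # Blank lines are sent while non-streaming requests wait; SSE comment
    # lines start with ':' (e.g. ": keep-alive") for streaming requests.
    stripped = line.strip()
    return stripped == "" or stripped.startswith(":")
```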
Is LangChain supported?
Yes. LangChain supports the OpenAI API interface, and the DeepSeek API is OpenAI-compatible. You can download the code file below and replace the API key in it to call the DeepSeek API from LangChain.
deepseek_langchain.py
How do I calculate token usage offline?
See Token & Token Usage.

------- • -------

https://api-docs.deepseek.com/zh-cn/updates

Change Log

Version 2025-01-20
deepseek-reasoner

deepseek-reasoner is our new model DeepSeek-R1. It can be invoked by specifying model=deepseek-reasoner.

For update details, see: DeepSeek-R1 Release
For usage, see: Reasoning Model

Version 2024-12-26
deepseek-chat

The deepseek-chat model has been upgraded to DeepSeek-V3. The interface is unchanged; invoke it by specifying model=deepseek-chat.

For update details, see: DeepSeek-V3 Release

Version 2024-12-10
deepseek-chat
The deepseek-chat model has been upgraded to DeepSeek-V2.5-1210, with improvements across the board. Benchmark results:

Math: MATH-500 score improved from 74.8% to 82.8%

Coding: LiveCodeBench (08.01–12.01) accuracy improved from 29.2% to 34.38%
Chinese writing and reasoning: corresponding improvements on internal test sets
In addition, the new version improves the user experience of the file upload and web-page summarization features.

Version 2024-09-05
deepseek-coder & deepseek-chat upgraded to the DeepSeek V2.5 model
The DeepSeek V2 Chat and DeepSeek Coder V2 models have been merged and upgraded into the new DeepSeek V2.5.
For backward compatibility, API users can access the new model through either deepseek-coder or deepseek-chat.
The new model significantly outperforms both previous models in general and coding capabilities.
It is better aligned with human preferences and has been optimized in many areas, including writing and instruction following:

ArenaHard win rate improved from 68.3% to 76.3%

AlpacaEval 2.0 LC win rate improved from 46.61% to 50.52%
MT-Bench score improved from 8.84 to 9.02
AlignBench score improved from 7.88 to 8.04

Building on the original Coder model, the new model further improves code generation, is optimized for common programming scenarios, and achieves the following results on standard benchmarks:

HumanEval: 89%
LiveCodeBench (January–September): 41%

Version 2024-08-02
Context caching on disk launched for the API
The DeepSeek API pioneers disk-based context caching, cutting prices by another order of magnitude.
For details, see: Context Caching is Available 2024/08/02

Version 2024-07-25
API updates

Updated endpoint /chat/completions:

JSON output
Function calling
Chat prefix completion (Beta)
8K maximum output (Beta)

New endpoint /completions:

FIM completion (Beta)

For details, see: New API Features 2024/07/25

Version 2024-07-24
deepseek-coder
The deepseek-coder model has been upgraded to DeepSeek-Coder-V2-0724.

Version 2024-06-28
deepseek-chat
The deepseek-chat model has been upgraded to DeepSeek-V2-0628, with improved reasoning ability. Benchmark results:

Coding: HumanEval Pass@1 79.88% -> 84.76%

Math: MATH ACC@1 55.02% -> 71.02%
Reasoning: BBH 78.56% -> 83.40%

In the Arena-Hard evaluation, the win rate against GPT-4-0314 rose from 41.6% to 68.3%.

The model's role-playing ability is significantly enhanced; it can play different roles on request during a conversation.
Version 2024-06-14
deepseek-coder
The deepseek-coder model has been upgraded to DeepSeek-Coder-V2-0614. Coding ability is significantly improved, reaching the level of GPT-4-Turbo-0409 in code generation, code understanding, code repair, and code completion, with excellent math and reasoning ability; its general capability is on par with DeepSeek-V2-0517.

Version 2024-05-17
deepseek-chat
The deepseek-chat model has been upgraded to DeepSeek-V2-0517. Instruction following is significantly improved: IFEval Benchmark prompt-level accuracy jumped from 63.9% to 77.6%. We also optimized the model's adherence to instructions in the API's "system" field, noticeably improving tasks such as immersive translation and RAG.
The model's JSON output accuracy is also improved: on an internal test set, the JSON parse rate rose from 78% to 85%, and with appropriate regular expressions it rises further to 97%.


------- • -------

https://api-docs.deepseek.com/zh-cn/

Your First API Call
The DeepSeek API uses an API format compatible with OpenAI. By modifying the configuration, you can use the OpenAI SDK, or software compatible with the OpenAI API, to access the DeepSeek API.

PARAM      VALUE
base_url*  https://api.deepseek.com
api_key    apply for an API key

* For compatibility with OpenAI, you can also set base_url to https://api.deepseek.com/v1, but note that the v1 here is unrelated to the model's version.
* The deepseek-chat model has been fully upgraded to DeepSeek-V3; the interface is unchanged. Invoke DeepSeek-V3 by specifying model='deepseek-chat'.
* deepseek-reasoner is DeepSeek's latest reasoning model, DeepSeek-R1. Invoke DeepSeek-R1 by specifying model='deepseek-reasoner'.

Invoke the Chat API
Once you have created an API key, you can use the following example scripts to access the DeepSeek API. The examples use non-streaming output; you can set stream to true to use streaming output.

curl:

curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <DeepSeek API Key>" \
  -d '{
        "model": "deepseek-chat",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Hello!"}
        ],
        "stream": false
      }'

python:

# Please install OpenAI SDK first: `pip3 install openai`
from openai import OpenAI

client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    stream=False
)

print(response.choices[0].message.content)

nodejs:

// Please install OpenAI SDK first: `npm install openai`
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: 'https://api.deepseek.com',
  apiKey: '<DeepSeek API Key>'
});

async function main() {
  const completion = await openai.chat.completions.create({
    messages: [{ role: "system", content: "You are a helpful assistant." }],
    model: "deepseek-chat",
  });

  console.log(completion.choices[0].message.content);
}

main();

------- • -------

https://api-docs.deepseek.com/quick_start/pricing

Models & Pricing
The prices listed below are in units of 1M tokens. A token, the smallest unit of text that the model recognizes, can be a word, a number, or even a punctuation mark. We bill based on the total number of input and output tokens processed by the model.
Pricing Details

Prices in USD:
MODEL(1)           CONTEXT LENGTH  MAX COT TOKENS(2)  MAX OUTPUT TOKENS(3)  INPUT PRICE / 1M (CACHE HIT)(4)  INPUT PRICE / 1M (CACHE MISS)  OUTPUT PRICE / 1M
deepseek-chat      64K             -                  8K                    $0.07 ($0.014 discounted)(5)     $0.27 ($0.14 discounted)(5)    $1.10 ($0.28 discounted)(5)
deepseek-reasoner  64K             32K                8K                    $0.14                            $0.55                          $2.19 (6)

Prices in CNY:
MODEL(1)           CONTEXT LENGTH  MAX COT TOKENS(2)  MAX OUTPUT TOKENS(3)  INPUT PRICE / 1M (CACHE HIT)(4)  INPUT PRICE / 1M (CACHE MISS)  OUTPUT PRICE / 1M
deepseek-chat      64K             -                  8K                    ¥0.5 (¥0.1 discounted)(5)        ¥2 (¥1 discounted)(5)          ¥8 (¥2 discounted)(5)
deepseek-reasoner  64K             32K                8K                    ¥1                               ¥4                             ¥16 (6)

(1) The deepseek-chat model has been upgraded to DeepSeek-V3. deepseek-reasoner


points to the new model DeepSeek-R1.
(2) CoT (Chain of Thought) is the reasoning content deepseek-reasoner gives before
output the final answer. For details, please refer to Reasoning Model。
(3) If max_tokens is not specified, the default maximum output length is 4K. Please
adjust max_tokens to support longer outputs.
(4) Please check DeepSeek Context Caching for the details of Context Caching.
(5) The form shows the the original price and the discounted price. From now until
2025-02-08 16:00 (UTC), all users can enjoy the discounted prices of DeepSeek API.
After that, it will recover to full price. DeepSeek-R1 is not included in the
discount.
(6) The output token count of deepseek-reasoner includes all tokens from CoT and
the final answer, and they are priced equally.

Deduction Rules
The expense = number of tokens × price.
The corresponding fees will be deducted directly from your topped-up balance or granted balance; when both balances are available, the granted balance is used first.
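As an illustration of the deduction formula, the cost of a deepseek-chat request at the discounted USD prices above can be computed like this (a sketch only; actual billing uses the token counts returned in the response's usage field):

```python
def chat_cost_usd(cache_hit: int, cache_miss: int, output_tokens: int) -> float:
    # Discounted deepseek-chat prices per 1M tokens: $0.014 (cache hit),
    # $0.14 (cache miss), $0.28 (output).
    return (cache_hit * 0.014 + cache_miss * 0.14 + output_tokens * 0.28) / 1_000_000
```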
Product prices may vary, and DeepSeek reserves the right to adjust them. We recommend topping up based on your actual usage and checking this page regularly for the most recent pricing information.
Copyright © 2025 DeepSeek, Inc.






------- • -------

https://api-docs.deepseek.com/quick_start/parameter_settings

The Temperature Parameter
The default value of temperature is 1.0.

We recommend setting the temperature according to your use case, as listed below.
USE CASE                        TEMPERATURE
Coding / Math                   0.0
Data Cleaning / Data Analysis   1.0
General Conversation            1.3
Translation                     1.3
Creative Writing / Poetry       1.5
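The temperature is passed per request; a small lookup that encodes the table's recommendations can keep call sites consistent (the key names below are our own, not part of the API):

```python
RECOMMENDED_TEMPERATURE = {
    "coding_math": 0.0,
    "data_cleaning_analysis": 1.0,
    "general_conversation": 1.3,
    "translation": 1.3,
    "creative_writing_poetry": 1.5,
}

def temperature_for(use_case: str) -> float:
    # Fall back to the API default of 1.0 for unlisted use cases.
    return RECOMMENDED_TEMPERATURE.get(use_case, 1.0)
```

The returned value would then be passed as the `temperature` argument of a chat completion request.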




------- • -------

https://api-docs.deepseek.com/quick_start/token_usage

Token & Token Usage
Tokens are the basic units used by models to represent natural language text, and
also the units we use for billing. They can be intuitively understood as
'characters' or 'words'. Typically, a Chinese word, an English word, a number, or a
symbol is counted as a token.
Generally, the conversion ratio between tokens and the number of characters is approximately as follows:

1 English character ≈ 0.3 token.


1 Chinese character ≈ 0.6 token.

However, due to the different tokenization methods used by different models, the
conversion ratios can vary. The actual number of tokens processed each time is
based on the model's return, which you can view from the usage results.
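The ratios above give a rough offline estimate (illustrative only; the authoritative count is always the usage field returned by the model):

```python
def estimate_tokens(english_chars: int = 0, chinese_chars: int = 0) -> float:
    # Rule of thumb from this page: 1 English character ≈ 0.3 token,
    # 1 Chinese character ≈ 0.6 token. Real counts vary by tokenizer.
    return english_chars * 0.3 + chinese_chars * 0.6
```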
Calculate token usage offline
You can run the demo tokenizer code in the following zip package to calculate the token usage for your input/output.
deepseek_v3_tokenizer.zip




------- • -------

https://api-docs.deepseek.com/quick_start/rate_limit

Rate Limit
The DeepSeek API does NOT impose rate limits on users. We will try our best to serve every request.
However, please note that when our servers are under high traffic pressure, your
requests may take some time to receive a response from the server. During this
period, your HTTP request will remain connected, and you may continuously receive
contents in the following formats:

Non-streaming requests: Continuously return empty lines


Streaming requests: Continuously return SSE keep-alive comments (: keep-alive)

These contents do not affect the parsing of the JSON body by the OpenAI SDK. If you
are parsing the HTTP responses yourself, please ensure to handle these empty lines
or comments appropriately.
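For readers parsing the HTTP response themselves, the filtering described above can be sketched as follows. This is an illustrative helper, not part of any SDK; it assumes you already have the raw SSE lines in hand:

```python
# Minimal sketch of handling keep-alive noise when parsing a streaming
# response yourself (without the OpenAI SDK). Empty lines and lines
# starting with ":" (e.g. ": keep-alive") are SSE comments/heartbeats
# and carry no payload, so they are skipped here.

def iter_sse_data(lines):
    """Yield only the data payloads from raw SSE lines."""
    for raw in lines:
        line = raw.strip("\r\n")
        if not line or line.startswith(":"):  # empty line or ": keep-alive"
            continue
        if line.startswith("data: "):
            yield line[len("data: "):]

raw = ["", ": keep-alive", "data: {\"id\":1}", "", "data: [DONE]"]
print(list(iter_sse_data(raw)))  # ['{"id":1}', '[DONE]']
```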
If the request is still not completed after 30 minutes, the server will close the
connection.

------- • -------


https://api-docs.deepseek.com/quick_start/error_codes

Error Codes
When calling the DeepSeek API, you may encounter errors. The causes and solutions are listed below.
400 - Invalid Format
Cause: Invalid request body format.
Solution: Please modify your request body according to the hints in the error message. For more API format details, please refer to DeepSeek API Docs.

401 - Authentication Fails
Cause: Authentication fails due to the wrong API key.
Solution: Please check your API key. If you don't have one, please create an API key first.

402 - Insufficient Balance
Cause: You have run out of balance.
Solution: Please check your account's balance, and go to the Top up page to add funds.

422 - Invalid Parameters
Cause: Your request contains invalid parameters.
Solution: Please modify your request parameters according to the hints in the error message. For more API format details, please refer to DeepSeek API Docs.

429 - Rate Limit Reached
Cause: You are sending requests too quickly.
Solution: Please pace your requests reasonably. We also advise users to temporarily switch to the APIs of alternative LLM service providers, like OpenAI.

500 - Server Error
Cause: Our server encounters an issue.
Solution: Please retry your request after a brief wait and contact us if the issue persists.

503 - Server Overloaded
Cause: The server is overloaded due to high traffic.
Solution: Please retry your request after a brief wait.
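Since 429, 500, and 503 are all transient and the suggested remedy is to wait and retry, a simple backoff loop covers them. This is an illustrative sketch: `send_request` is a placeholder for your own HTTP call and is assumed to return an object exposing a `status_code`:

```python
# Illustrative retry helper for the retryable statuses listed above
# (429 Rate Limit Reached, 500 Server Error, 503 Server Overloaded).
# `send_request` is a hypothetical callable standing in for your HTTP
# client; it must return an object with a `status_code` attribute.

import time

RETRYABLE = {429, 500, 503}

def call_with_retry(send_request, max_retries=3, base_delay=1.0):
    for attempt in range(max_retries + 1):
        resp = send_request()
        if resp.status_code not in RETRYABLE or attempt == max_retries:
            return resp
        # Exponential backoff: base_delay, 2x, 4x, ...
        time.sleep(base_delay * (2 ** attempt))
    return resp
```

Non-retryable statuses (400, 401, 402, 422) are returned immediately, since repeating the same request will not fix an invalid body or a wrong API key.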

------- • -------


https://api-docs.deepseek.com/news/news250120

DeepSeek-R1 Release 2025/01/20

⚡ Performance on par with OpenAI-o1

📖 Fully open-source model & technical report

🏆 MIT licensed: Distill & commercialize freely!

🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today!
🔥 Bonus: Open-Source Distilled Models!

🔬 Distilled from DeepSeek-R1, 6 small models fully open-sourced

📏 32B & 70B models on par with OpenAI-o1-mini

🤝 Empowering the open-source community

🌍 Pushing the boundaries of open AI!

📜 License Update!

🔄 DeepSeek-R1 is now MIT licensed for clear open access

🔓 Open for the community to leverage model weights & outputs

API outputs can now be used for fine-tuning & distillation

DeepSeek-R1: Technical Highlights

📈 Large-scale RL in post-training

🏆 Significant performance boost with minimal labeled data

🔢 Math, code, and reasoning tasks on par with OpenAI-o1

📄 More details: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf
🌐 API Access & Pricing

⚙️ Use DeepSeek-R1 by setting model=deepseek-reasoner

💰 $0.14 / million input tokens (cache hit)

💰 $0.55 / million input tokens (cache miss)

💰 $2.19 / million output tokens

📖 API guide: https://api-docs.deepseek.com/guides/reasoning_model
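The prices above can be turned into a quick per-request cost estimate. The token counts below are made up purely for illustration; note that for deepseek-reasoner the billed output includes the reasoning (CoT) tokens:

```python
# Worked example of the deepseek-reasoner prices listed above, in USD
# per million tokens: $0.14 input (cache hit), $0.55 input (cache miss),
# $2.19 output. The token counts passed in are hypothetical.

PRICE_IN_HIT, PRICE_IN_MISS, PRICE_OUT = 0.14, 0.55, 2.19  # $ / 1M tokens

def r1_cost(hit_tokens, miss_tokens, output_tokens):
    return (hit_tokens * PRICE_IN_HIT
            + miss_tokens * PRICE_IN_MISS
            + output_tokens * PRICE_OUT) / 1_000_000

# 100k cached input + 50k uncached input + 20k output (incl. CoT)
print(f"${r1_cost(100_000, 50_000, 20_000):.4f}")  # $0.0853
```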


------- • -------


https://api-docs.deepseek.com/news/news250115

Introducing DeepSeek App 2025/01/15

💡 Powered by world-class DeepSeek-V3

🆓 FREE to use with seamless interaction

📱 Now officially available on App Store & Google Play & Major Android markets

🔗 Download now: https://download.deepseek.com/app/

Key Features of DeepSeek App:

🔐 Easy login: E-mail/Google Account/Apple ID

☁️ Cross-platform chat history sync


🔍 Web search & Deep-Think mode

📄 File upload & text extraction

Important Notice:

✅ 100% FREE - No ads, no in-app purchases

Download only from official channels to avoid being misled

📲 Search "DeepSeek" in your app store or visit our website for direct links


------- • -------

https://api-docs.deepseek.com/news/news1226

🚀 Introducing DeepSeek-V3 2024/12/26
Biggest leap forward yet

⚡ 60 tokens/second (3x faster than V2!)


💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🎉 What’s new in V3

🧠 671B MoE parameters


🚀 37B activated parameters
📚 Trained on 14.8T high-quality tokens

🔗 Dive deeper here:

Model 👉 https://github.com/deepseek-ai/DeepSeek-V3
Paper 👉 https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf

💰 API Pricing Update

🎉 Until Feb 8: same as V2!


🤯 From Feb 8 onwards:

Input (cache miss): $0.27 / M tokens
Input (cache hit): $0.07 / M tokens
Output: $1.10 / M tokens


Still the best value in the market! 🔥
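The gap between the cache-hit and cache-miss input prices above is where most of the savings come from. A quick illustration, with hypothetical token volumes and cache-hit ratios:

```python
# Illustration of the post-Feb-8 V3 input prices above: $0.27/M tokens
# on cache miss vs $0.07/M tokens on cache hit. The traffic volume and
# hit ratio below are hypothetical.

def v3_input_cost(tokens, cache_hit_ratio):
    hit = tokens * cache_hit_ratio
    miss = tokens - hit
    return (hit * 0.07 + miss * 0.27) / 1_000_000

full = v3_input_cost(10_000_000, 0.0)    # no caching: $2.70
cached = v3_input_cost(10_000_000, 0.8)  # 80% cache hits: $1.10
print(full, cached)
```

With an 80% hit ratio, 10M input tokens cost $1.10 instead of $2.70, so repeated prefixes (long system prompts, multi-round history) are much cheaper than the headline miss price suggests.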

🌌 Open-source spirit + Longtermism to inclusive AGI


🌟 DeepSeek’s mission is unwavering. We’re thrilled to share our progress with the
community and see the gap between open and closed models narrowing.
🚀 This is just the beginning! Look forward to multimodal support and other cutting-
edge features in the DeepSeek ecosystem.
💡 Together, let’s push the boundaries of innovation!


------- • -------


https://api-docs.deepseek.com/news/news1210

DeepSeek-V2.5-1210 Release 2024/12/10
🚀 DeepSeek V2.5: The Grand Finale 🎉
🌐 Internet Search is now live on the web! Visit https://chat.deepseek.com/ and toggle “Internet Search” for real-time answers. 🕒

📊 DeepSeek-V2.5-1210 raises the bar across benchmarks like math, coding, writing,
and roleplay—built to serve all your work and life needs.
🔧 Explore the open-source model on Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-V2.5-1210

🙌 With the release of DeepSeek-V2.5-1210, the V2.5 series comes to an end.


💪 Since May, the DeepSeek V2 series has brought 5 impactful updates, earning your
trust and support along the way.
✨ As V2 closes, it’s not the end—it’s the beginning of something greater. DeepSeek
is working on next-gen foundation models to push boundaries even further. Stay
tuned!
“Every end is a new beginning.”


------- • -------


https://api-docs.deepseek.com/news/news1120

DeepSeek-R1-Lite Release 2024/11/20

🚀 DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power!
🔍 o1-preview-level performance on AIME & MATH benchmarks.
💡 Transparent thought process in real-time.
Open-source models & API coming soon!
🌐 Try it now at http://chat.deepseek.com

🌟 Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!

🌟 Inference Scaling Laws of DeepSeek-R1-Lite-Preview


Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score
improvements on AIME as thought length increases.


------- • -------


https://api-docs.deepseek.com/zh-cn/guides/function_calling
Function Calling
Function Calling lets the model call external tools to enhance its capabilities.
Note
The Function Calling feature of the current deepseek-chat model is unstable: it can produce looping calls or empty replies. We are actively working on a fix, which is expected in the next model version.
Sample Code
The complete Python code below demonstrates Function Calling by fetching the weather for the user's current location.
For the exact API format of Function Calling, please refer to the Chat Completion documentation.
from openai import OpenAI

def send_messages(messages):
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        tools=tools
    )
    return response.choices[0].message

client = OpenAI(
    api_key="<your api key>",
    base_url="https://api.deepseek.com",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather of a location, the user should supply a location first",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    }
                },
                "required": ["location"]
            },
        }
    },
]

messages = [{"role": "user", "content": "How's the weather in Hangzhou?"}]
message = send_messages(messages)
print(f"User>\t {messages[0]['content']}")

tool = message.tool_calls[0]
messages.append(message)
messages.append({"role": "tool", "tool_call_id": tool.id, "content": "24℃"})
message = send_messages(messages)
print(f"Model>\t {message.content}")
The execution flow of this example is as follows:

1. User: asks about the current weather
2. Model: returns the function call get_weather({location: 'Hangzhou'})
3. User: executes get_weather({location: 'Hangzhou'}) and passes the result back to the model
4. Model: returns the natural-language answer, "The current temperature in Hangzhou is 24°C."

Note: the functionality of the get_weather function in the code above must be provided by the user; the model itself does not execute concrete functions.
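The user-side execution step in the flow above can be sketched as a small dispatch helper. Here `get_weather` is a hypothetical local stub (the real implementation is up to the user), and `execute_tool_call` mimics how one might handle the `tool_calls[0]` entry from a response:

```python
import json

# Hypothetical local implementation of the get_weather tool declared in
# `tools`; in a real application this would query a weather service.
def get_weather(location: str) -> str:
    return f"24℃ in {location}"

# Registry mapping declared tool names to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}

def execute_tool_call(name: str, arguments: str) -> str:
    """Run the tool the model requested; `arguments` is the JSON string
    carried in tool_call.function.arguments."""
    return TOOL_REGISTRY[name](**json.loads(arguments))

# Simulating the call the model returns for "How's the weather in Hangzhou?"
print(execute_tool_call("get_weather", '{"location": "Hangzhou"}'))
# → 24℃ in Hangzhou
```

The string result is what you would place in the `"content"` of the follow-up `{"role": "tool", ...}` message.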


------- • -------


https://api-docs.deepseek.com/zh-cn/guides/kv_cache

Context Caching
Context disk caching in the DeepSeek API is enabled by default for all users; you benefit from it without any code changes.
Every request triggers construction of the disk cache. If a later request repeats a prefix of an earlier request, the repeated part is pulled from the cache and counted as a "cache hit".
Note: between two requests, only the repeated prefix part can trigger a "cache hit". See the examples below.

Example 1: Long-document Q&A
First request
messages: [
    {"role": "system", "content": "You are a senior financial-report analyst..."},
    {"role": "user", "content": "<financial report content>\n\nPlease summarize the key information in this report."}
]
Second request
messages: [
    {"role": "system", "content": "You are a senior financial-report analyst..."},
    {"role": "user", "content": "<financial report content>\n\nPlease analyze the profitability shown in this report."}
]
In this example, the two requests share the same prefix: the system message plus the <financial report content> part of the user message. On the second request, this prefix is counted as a "cache hit".

Example 2: Multi-round Conversation
First request
messages: [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of China?"}
]
Second request
messages: [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of China?"},
    {"role": "assistant", "content": "The capital of China is Beijing."},
    {"role": "user", "content": "What is the capital of the United States?"}
]
In this example, the second request can reuse the system message and user message from the beginning of the first request; that part counts as a "cache hit".
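The multi-round pattern above amounts to keeping one growing messages list and appending each reply and follow-up question, so every later request repeats the earlier request's messages as an exact prefix. A minimal sketch (the assistant reply is hard-coded here for illustration):

```python
# Keep one growing message list so each later request shares the previous
# request's messages as an exact prefix (the part that can hit the cache).
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of China?"},
]
first_request = list(messages)  # what the first request sends

# After the first response arrives, append it plus the next question.
messages.append({"role": "assistant", "content": "The capital of China is Beijing."})
messages.append({"role": "user", "content": "What is the capital of the United States?"})

# The second request's first two messages are identical to the first request,
# so that prefix is eligible to be counted as a cache hit.
assert messages[:2] == first_request
print(len(messages))  # 4
```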

Example 3: Few-shot Learning
In practice, users can improve the model's output with few-shot learning, i.e. providing a few examples in the request so the model learns a specific pattern. Since few-shot prompts generally share the same context prefix, disk caching makes the cost of few-shot usage significantly lower.
First request
messages: [
    {"role": "system", "content": "You are an expert historian. The user will ask a series of questions. Your answers should be concise and start with `Answer:`"},
    {"role": "user", "content": "In which year did Qin Shi Huang unify the six states?"},
    {"role": "assistant", "content": "Answer: 221 BC"},
    {"role": "user", "content": "Who was the founder of the Han dynasty?"},
    {"role": "assistant", "content": "Answer: Liu Bang"},
    {"role": "user", "content": "Who was the last emperor of the Tang dynasty?"},
    {"role": "assistant", "content": "Answer: Li Zhu"},
    {"role": "user", "content": "Who was the founding emperor of the Ming dynasty?"},
    {"role": "assistant", "content": "Answer: Zhu Yuanzhang"},
    {"role": "user", "content": "Who was the founding emperor of the Qing dynasty?"}
]
Second request
messages: [
    {"role": "system", "content": "You are an expert historian. The user will ask a series of questions. Your answers should be concise and start with `Answer:`"},
    {"role": "user", "content": "In which year did Qin Shi Huang unify the six states?"},
    {"role": "assistant", "content": "Answer: 221 BC"},
    {"role": "user", "content": "Who was the founder of the Han dynasty?"},
    {"role": "assistant", "content": "Answer: Liu Bang"},
    {"role": "user", "content": "Who was the last emperor of the Tang dynasty?"},
    {"role": "assistant", "content": "Answer: Li Zhu"},
    {"role": "user", "content": "Who was the founding emperor of the Ming dynasty?"},
    {"role": "assistant", "content": "Answer: Zhu Yuanzhang"},
    {"role": "user", "content": "When did the Shang dynasty fall?"}
]
In this example, 4 shots are used. The two requests differ only in the last question, so the second request can reuse the first 4 rounds of dialogue from the first request; that part counts as a "cache hit".

Checking Cache Hit Status
In DeepSeek API responses, two fields have been added to the usage object to report the cache status of a request:

prompt_cache_hit_tokens: the number of input tokens in this request that hit the cache (0.1 yuan / million tokens)

prompt_cache_miss_tokens: the number of input tokens in this request that missed the cache (1 yuan / million tokens)
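Combined with the prices above, these two usage fields let you estimate the input cost of a request. `input_cost` below is a hypothetical helper, not part of the API, and the usage values are illustrative:

```python
# Hypothetical helper (not part of the API) estimating input cost in yuan
# from the `usage` object of a response, using the prices quoted above.
CACHE_HIT_PRICE = 0.1   # yuan per million input tokens on cache hit
CACHE_MISS_PRICE = 1.0  # yuan per million input tokens on cache miss

def input_cost(usage: dict) -> float:
    hit = usage.get("prompt_cache_hit_tokens", 0)
    miss = usage.get("prompt_cache_miss_tokens", 0)
    return (hit * CACHE_HIT_PRICE + miss * CACHE_MISS_PRICE) / 1_000_000

# Illustrative values, not from a real response.
usage = {"prompt_cache_hit_tokens": 64, "prompt_cache_miss_tokens": 100}
print(input_cost(usage))  # ≈ 0.0001064 yuan
```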

Disk Cache and Output Randomness
The disk cache matches only the prefix of the user's input; the output is still produced by inference and is still affected by parameters such as temperature, which introduces randomness. The output quality is the same as without disk caching.
Other Notes

The cache is stored in units of 64 tokens; content shorter than 64 tokens will not be cached.

The cache service is best-effort and does not guarantee a 100% hit rate.

Cache construction takes on the order of seconds. Caches that are no longer used are cleared automatically, generally after a few hours to a few days.

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/kv_cache#

跳到主要内容 DeepSeek API 文档中文(中国)English 中文(中国)DeepSeek Platform 快速开始首次调用 API 模型 & 价格


Temperature 设置 Token 用量计算限速错误码新闻 DeepSeek-R1 发布 2025/01/20DeepSeek APP 发布
2025/01/15DeepSeek-V3 发布 2024/12/26DeepSeek-V2.5-1210 发布 2024/12/10DeepSeek-R1-Lite
发布 2024/11/20DeepSeek-V2.5 发布 2024/09/05API 上线硬盘缓存 2024/08/02API 升级新功能
2024/07/25API 文档 API 指南推理模型 (deepseek-reasoner)多轮对话对话前缀续写(Beta)FIM 补全(Beta)JSON
OutputFunction Calling 上下文硬盘缓存提示库其它资源实用集成 API 服务状态常见问题更新日志 API 指南上下文硬盘缓存本页总览上下文硬盘缓存
DeepSeek API 上下文硬盘缓存技术对所有用户默认开启,用户无需修改代码即可享用。
用户的每一个请求都会触发硬盘缓存的构建。若后续请求与之前的请求在前缀上存在重复,则重复部分只需要从缓存中拉取,计入“缓存命中”。
注意:两个请求间,只有重复的前缀部分才能触发“缓存命中”,详间下面的例子。

例一:长文本问答
第一次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请总结一下这份财报的关键信息。"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请分析一下这份财报的盈利情况。"}]
在上例中,两次请求都有相同的前缀,即 system 消息 + user 消息中的 <财报内容>。在第二次请求时,这部分前缀会计入“缓存命中”。

例二:多轮对话
第一次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}, {"role": "assistant", "content": "中国的首都是北京。"},
{"role": "user", "content": "美国的首都是哪里?"}]
在上例中,第二次请求可以复用第一次请求开头的 system 消息和 user 消息,这部分会计入“缓存命中”。

例三:使用 Few-shot 学习
在实际应用中,用户可以通过 Few-shot 学习的方式,来提升模型的输出效果。所谓 Few-shot 学习,是指在请求中提供一些示例,让模型学习到特定的模式。由于
Few-shot 一般提供相同的上下文前缀,在硬盘缓存的加持下,Few-shot 的费用显著降低。
第一次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问清朝的开国皇帝是谁?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问商朝是什么时候灭亡的"}, ]
在上例中,使用了 4-shots。两次请求只有最后一个问题不一样,第二次请求可以复用第一次请求中前 4 轮对话的内容,这部分会计入“缓存命中”。

查看缓存命中情况
在 DeepSeek API 的返回中,我们在 usage 字段中增加了两个字段,来反映请求的缓存命中情况:

prompt_cache_hit_tokens:本次请求的输入中,缓存命中的 tokens 数(0.1 元 / 百万 tokens)

prompt_cache_miss_tokens:本次请求的输入中,缓存未命中的 tokens 数(1 元 / 百万 tokens)

硬盘缓存与输出随机性
硬盘缓存只匹配到用户输入的前缀部分,输出仍然是通过计算推理得到的,仍然受到 temperature 等参数的影响,从而引入随机性。其输出效果与不使用硬盘缓存相同。
其它说明

缓存系统以 64 tokens 为一个存储单元,不足 64 tokens 的内容不会被缓存

缓存系统是“尽力而为”,不保证 100% 缓存命中

缓存构建耗时为秒级。缓存不再使用后会自动被清空,时间一般为几个小时到几天

上一页 Function Calling 下一页常见问题例一:长文本问答例二:多轮对话例三:使用 Few-shot 学习查看缓存命中情况硬盘缓存与输出随机性其它说明微信


公众号

社区邮箱 DiscordTwitter 更多 GitHubCopyright © 2024 DeepSeek, Inc.

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/kv_cache#%E4%BE%8B%E4%B8%80%E9%95%BF
%E6%96%87%E6%9C%AC%E9%97%AE%E7%AD%94

跳到主要内容 DeepSeek API 文档中文(中国)English 中文(中国)DeepSeek Platform 快速开始首次调用 API 模型 & 价格


Temperature 设置 Token 用量计算限速错误码新闻 DeepSeek-R1 发布 2025/01/20DeepSeek APP 发布
2025/01/15DeepSeek-V3 发布 2024/12/26DeepSeek-V2.5-1210 发布 2024/12/10DeepSeek-R1-Lite
发布 2024/11/20DeepSeek-V2.5 发布 2024/09/05API 上线硬盘缓存 2024/08/02API 升级新功能
2024/07/25API 文档 API 指南推理模型 (deepseek-reasoner)多轮对话对话前缀续写(Beta)FIM 补全(Beta)JSON
OutputFunction Calling 上下文硬盘缓存提示库其它资源实用集成 API 服务状态常见问题更新日志 API 指南上下文硬盘缓存本页总览上下文硬盘缓存
DeepSeek API 上下文硬盘缓存技术对所有用户默认开启,用户无需修改代码即可享用。
用户的每一个请求都会触发硬盘缓存的构建。若后续请求与之前的请求在前缀上存在重复,则重复部分只需要从缓存中拉取,计入“缓存命中”。
注意:两个请求间,只有重复的前缀部分才能触发“缓存命中”,详间下面的例子。

例一:长文本问答
第一次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请总结一下这份财报的关键信息。"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请分析一下这份财报的盈利情况。"}]
在上例中,两次请求都有相同的前缀,即 system 消息 + user 消息中的 <财报内容>。在第二次请求时,这部分前缀会计入“缓存命中”。

例二:多轮对话
第一次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}, {"role": "assistant", "content": "中国的首都是北京。"},
{"role": "user", "content": "美国的首都是哪里?"}]
在上例中,第二次请求可以复用第一次请求开头的 system 消息和 user 消息,这部分会计入“缓存命中”。

例三:使用 Few-shot 学习
在实际应用中,用户可以通过 Few-shot 学习的方式,来提升模型的输出效果。所谓 Few-shot 学习,是指在请求中提供一些示例,让模型学习到特定的模式。由于
Few-shot 一般提供相同的上下文前缀,在硬盘缓存的加持下,Few-shot 的费用显著降低。
第一次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问清朝的开国皇帝是谁?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问商朝是什么时候灭亡的"}, ]
在上例中,使用了 4-shots。两次请求只有最后一个问题不一样,第二次请求可以复用第一次请求中前 4 轮对话的内容,这部分会计入“缓存命中”。

查看缓存命中情况
在 DeepSeek API 的返回中,我们在 usage 字段中增加了两个字段,来反映请求的缓存命中情况:

prompt_cache_hit_tokens:本次请求的输入中,缓存命中的 tokens 数(0.1 元 / 百万 tokens)

prompt_cache_miss_tokens:本次请求的输入中,缓存未命中的 tokens 数(1 元 / 百万 tokens)

硬盘缓存与输出随机性
硬盘缓存只匹配到用户输入的前缀部分,输出仍然是通过计算推理得到的,仍然受到 temperature 等参数的影响,从而引入随机性。其输出效果与不使用硬盘缓存相同。
其它说明

缓存系统以 64 tokens 为一个存储单元,不足 64 tokens 的内容不会被缓存

缓存系统是“尽力而为”,不保证 100% 缓存命中

缓存构建耗时为秒级。缓存不再使用后会自动被清空,时间一般为几个小时到几天

上一页 Function Calling 下一页常见问题例一:长文本问答例二:多轮对话例三:使用 Few-shot 学习查看缓存命中情况硬盘缓存与输出随机性其它说明微信


公众号

社区邮箱 DiscordTwitter 更多 GitHubCopyright © 2024 DeepSeek, Inc.

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/kv_cache#%E4%BE%8B%E4%BA%8C%E5%A4%9A
%E8%BD%AE%E5%AF%B9%E8%AF%9D

跳到主要内容 DeepSeek API 文档中文(中国)English 中文(中国)DeepSeek Platform 快速开始首次调用 API 模型 & 价格


Temperature 设置 Token 用量计算限速错误码新闻 DeepSeek-R1 发布 2025/01/20DeepSeek APP 发布
2025/01/15DeepSeek-V3 发布 2024/12/26DeepSeek-V2.5-1210 发布 2024/12/10DeepSeek-R1-Lite
发布 2024/11/20DeepSeek-V2.5 发布 2024/09/05API 上线硬盘缓存 2024/08/02API 升级新功能
2024/07/25API 文档 API 指南推理模型 (deepseek-reasoner)多轮对话对话前缀续写(Beta)FIM 补全(Beta)JSON
OutputFunction Calling 上下文硬盘缓存提示库其它资源实用集成 API 服务状态常见问题更新日志 API 指南上下文硬盘缓存本页总览上下文硬盘缓存
DeepSeek API 上下文硬盘缓存技术对所有用户默认开启,用户无需修改代码即可享用。
用户的每一个请求都会触发硬盘缓存的构建。若后续请求与之前的请求在前缀上存在重复,则重复部分只需要从缓存中拉取,计入“缓存命中”。
注意:两个请求间,只有重复的前缀部分才能触发“缓存命中”,详间下面的例子。

例一:长文本问答
第一次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请总结一下这份财报的关键信息。"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请分析一下这份财报的盈利情况。"}]
在上例中,两次请求都有相同的前缀,即 system 消息 + user 消息中的 <财报内容>。在第二次请求时,这部分前缀会计入“缓存命中”。

例二:多轮对话
第一次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}, {"role": "assistant", "content": "中国的首都是北京。"},
{"role": "user", "content": "美国的首都是哪里?"}]
在上例中,第二次请求可以复用第一次请求开头的 system 消息和 user 消息,这部分会计入“缓存命中”。

例三:使用 Few-shot 学习
在实际应用中,用户可以通过 Few-shot 学习的方式,来提升模型的输出效果。所谓 Few-shot 学习,是指在请求中提供一些示例,让模型学习到特定的模式。由于
Few-shot 一般提供相同的上下文前缀,在硬盘缓存的加持下,Few-shot 的费用显著降低。
第一次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问清朝的开国皇帝是谁?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问商朝是什么时候灭亡的"}, ]
在上例中,使用了 4-shots。两次请求只有最后一个问题不一样,第二次请求可以复用第一次请求中前 4 轮对话的内容,这部分会计入“缓存命中”。

查看缓存命中情况
在 DeepSeek API 的返回中,我们在 usage 字段中增加了两个字段,来反映请求的缓存命中情况:

prompt_cache_hit_tokens:本次请求的输入中,缓存命中的 tokens 数(0.1 元 / 百万 tokens)

prompt_cache_miss_tokens:本次请求的输入中,缓存未命中的 tokens 数(1 元 / 百万 tokens)

硬盘缓存与输出随机性
硬盘缓存只匹配到用户输入的前缀部分,输出仍然是通过计算推理得到的,仍然受到 temperature 等参数的影响,从而引入随机性。其输出效果与不使用硬盘缓存相同。
其它说明

缓存系统以 64 tokens 为一个存储单元,不足 64 tokens 的内容不会被缓存

缓存系统是“尽力而为”,不保证 100% 缓存命中

缓存构建耗时为秒级。缓存不再使用后会自动被清空,时间一般为几个小时到几天

上一页 Function Calling 下一页常见问题例一:长文本问答例二:多轮对话例三:使用 Few-shot 学习查看缓存命中情况硬盘缓存与输出随机性其它说明微信


公众号

社区邮箱 DiscordTwitter 更多 GitHubCopyright © 2024 DeepSeek, Inc.

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/kv_cache#%E4%BE%8B%E4%B8%89%E4%BD%BF
%E7%94%A8-few-shot-%E5%AD%A6%E4%B9%A0

跳到主要内容 DeepSeek API 文档中文(中国)English 中文(中国)DeepSeek Platform 快速开始首次调用 API 模型 & 价格


Temperature 设置 Token 用量计算限速错误码新闻 DeepSeek-R1 发布 2025/01/20DeepSeek APP 发布
2025/01/15DeepSeek-V3 发布 2024/12/26DeepSeek-V2.5-1210 发布 2024/12/10DeepSeek-R1-Lite
发布 2024/11/20DeepSeek-V2.5 发布 2024/09/05API 上线硬盘缓存 2024/08/02API 升级新功能
2024/07/25API 文档 API 指南推理模型 (deepseek-reasoner)多轮对话对话前缀续写(Beta)FIM 补全(Beta)JSON
OutputFunction Calling 上下文硬盘缓存提示库其它资源实用集成 API 服务状态常见问题更新日志 API 指南上下文硬盘缓存本页总览上下文硬盘缓存
DeepSeek API 上下文硬盘缓存技术对所有用户默认开启,用户无需修改代码即可享用。
用户的每一个请求都会触发硬盘缓存的构建。若后续请求与之前的请求在前缀上存在重复,则重复部分只需要从缓存中拉取,计入“缓存命中”。
注意:两个请求间,只有重复的前缀部分才能触发“缓存命中”,详间下面的例子。

例一:长文本问答
第一次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请总结一下这份财报的关键信息。"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请分析一下这份财报的盈利情况。"}]
在上例中,两次请求都有相同的前缀,即 system 消息 + user 消息中的 <财报内容>。在第二次请求时,这部分前缀会计入“缓存命中”。

例二:多轮对话
第一次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}, {"role": "assistant", "content": "中国的首都是北京。"},
{"role": "user", "content": "美国的首都是哪里?"}]
在上例中,第二次请求可以复用第一次请求开头的 system 消息和 user 消息,这部分会计入“缓存命中”。

例三:使用 Few-shot 学习
在实际应用中,用户可以通过 Few-shot 学习的方式,来提升模型的输出效果。所谓 Few-shot 学习,是指在请求中提供一些示例,让模型学习到特定的模式。由于
Few-shot 一般提供相同的上下文前缀,在硬盘缓存的加持下,Few-shot 的费用显著降低。
第一次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问清朝的开国皇帝是谁?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问商朝是什么时候灭亡的"}, ]
在上例中,使用了 4-shots。两次请求只有最后一个问题不一样,第二次请求可以复用第一次请求中前 4 轮对话的内容,这部分会计入“缓存命中”。

查看缓存命中情况
在 DeepSeek API 的返回中,我们在 usage 字段中增加了两个字段,来反映请求的缓存命中情况:

prompt_cache_hit_tokens:本次请求的输入中,缓存命中的 tokens 数(0.1 元 / 百万 tokens)

prompt_cache_miss_tokens:本次请求的输入中,缓存未命中的 tokens 数(1 元 / 百万 tokens)

硬盘缓存与输出随机性
硬盘缓存只匹配到用户输入的前缀部分,输出仍然是通过计算推理得到的,仍然受到 temperature 等参数的影响,从而引入随机性。其输出效果与不使用硬盘缓存相同。
其它说明

缓存系统以 64 tokens 为一个存储单元,不足 64 tokens 的内容不会被缓存

缓存系统是“尽力而为”,不保证 100% 缓存命中

缓存构建耗时为秒级。缓存不再使用后会自动被清空,时间一般为几个小时到几天

上一页 Function Calling 下一页常见问题例一:长文本问答例二:多轮对话例三:使用 Few-shot 学习查看缓存命中情况硬盘缓存与输出随机性其它说明微信


公众号

社区邮箱 DiscordTwitter 更多 GitHubCopyright © 2024 DeepSeek, Inc.

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/kv_cache#%E6%9F%A5%E7%9C%8B%E7%BC
%93%E5%AD%98%E5%91%BD%E4%B8%AD%E6%83%85%E5%86%B5

跳到主要内容 DeepSeek API 文档中文(中国)English 中文(中国)DeepSeek Platform 快速开始首次调用 API 模型 & 价格


Temperature 设置 Token 用量计算限速错误码新闻 DeepSeek-R1 发布 2025/01/20DeepSeek APP 发布
2025/01/15DeepSeek-V3 发布 2024/12/26DeepSeek-V2.5-1210 发布 2024/12/10DeepSeek-R1-Lite
发布 2024/11/20DeepSeek-V2.5 发布 2024/09/05API 上线硬盘缓存 2024/08/02API 升级新功能
2024/07/25API 文档 API 指南推理模型 (deepseek-reasoner)多轮对话对话前缀续写(Beta)FIM 补全(Beta)JSON
OutputFunction Calling 上下文硬盘缓存提示库其它资源实用集成 API 服务状态常见问题更新日志 API 指南上下文硬盘缓存本页总览上下文硬盘缓存
DeepSeek API 上下文硬盘缓存技术对所有用户默认开启,用户无需修改代码即可享用。
用户的每一个请求都会触发硬盘缓存的构建。若后续请求与之前的请求在前缀上存在重复,则重复部分只需要从缓存中拉取,计入“缓存命中”。
注意:两个请求间,只有重复的前缀部分才能触发“缓存命中”,详间下面的例子。

例一:长文本问答
第一次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请总结一下这份财报的关键信息。"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请分析一下这份财报的盈利情况。"}]
在上例中,两次请求都有相同的前缀,即 system 消息 + user 消息中的 <财报内容>。在第二次请求时,这部分前缀会计入“缓存命中”。

例二:多轮对话
第一次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}, {"role": "assistant", "content": "中国的首都是北京。"},
{"role": "user", "content": "美国的首都是哪里?"}]
在上例中,第二次请求可以复用第一次请求开头的 system 消息和 user 消息,这部分会计入“缓存命中”。

例三:使用 Few-shot 学习
在实际应用中,用户可以通过 Few-shot 学习的方式,来提升模型的输出效果。所谓 Few-shot 学习,是指在请求中提供一些示例,让模型学习到特定的模式。由于
Few-shot 一般提供相同的上下文前缀,在硬盘缓存的加持下,Few-shot 的费用显著降低。
第一次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问清朝的开国皇帝是谁?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位历史学专家,用户将提供一系列问题,你的回答应当简明
扼要,并以`Answer:`开头"}, {"role": "user", "content": "请问秦始皇统一六国是在哪一年?"},
{"role": "assistant", "content": "Answer:公元前 221 年"}, {"role": "user",
"content": "请问汉朝的建立者是谁?"}, {"role": "assistant", "content": "Answer:刘邦"},
{"role": "user", "content": "请问唐朝最后一任皇帝是谁"}, {"role": "assistant",
"content": "Answer:李柷"}, {"role": "user", "content": "请问明朝的开国皇帝是谁?"},
{"role": "assistant", "content": "Answer:朱元璋"}, {"role": "user", "content":
"请问商朝是什么时候灭亡的"}, ]
在上例中,使用了 4-shots。两次请求只有最后一个问题不一样,第二次请求可以复用第一次请求中前 4 轮对话的内容,这部分会计入“缓存命中”。

查看缓存命中情况
在 DeepSeek API 的返回中,我们在 usage 字段中增加了两个字段,来反映请求的缓存命中情况:

prompt_cache_hit_tokens:本次请求的输入中,缓存命中的 tokens 数(0.1 元 / 百万 tokens)

prompt_cache_miss_tokens:本次请求的输入中,缓存未命中的 tokens 数(1 元 / 百万 tokens)

硬盘缓存与输出随机性
硬盘缓存只匹配到用户输入的前缀部分,输出仍然是通过计算推理得到的,仍然受到 temperature 等参数的影响,从而引入随机性。其输出效果与不使用硬盘缓存相同。
其它说明
缓存系统以 64 tokens 为一个存储单元,不足 64 tokens 的内容不会被缓存

缓存系统是“尽力而为”,不保证 100% 缓存命中

缓存构建耗时为秒级。缓存不再使用后会自动被清空,时间一般为几个小时到几天

上一页 Function Calling 下一页常见问题例一:长文本问答例二:多轮对话例三:使用 Few-shot 学习查看缓存命中情况硬盘缓存与输出随机性其它说明微信


公众号

社区邮箱 DiscordTwitter 更多 GitHubCopyright © 2024 DeepSeek, Inc.

------- • -------

https://api-docs.deepseek.com/zh-cn/guides/kv_cache#%E7%A1%AC%E7%9B%98%E7%BC
%93%E5%AD%98%E4%B8%8E%E8%BE%93%E5%87%BA%E9%9A%8F%E6%9C%BA%E6%80%A7

跳到主要内容 DeepSeek API 文档中文(中国)English 中文(中国)DeepSeek Platform 快速开始首次调用 API 模型 & 价格


Temperature 设置 Token 用量计算限速错误码新闻 DeepSeek-R1 发布 2025/01/20DeepSeek APP 发布
2025/01/15DeepSeek-V3 发布 2024/12/26DeepSeek-V2.5-1210 发布 2024/12/10DeepSeek-R1-Lite
发布 2024/11/20DeepSeek-V2.5 发布 2024/09/05API 上线硬盘缓存 2024/08/02API 升级新功能
2024/07/25API 文档 API 指南推理模型 (deepseek-reasoner)多轮对话对话前缀续写(Beta)FIM 补全(Beta)JSON
OutputFunction Calling 上下文硬盘缓存提示库其它资源实用集成 API 服务状态常见问题更新日志 API 指南上下文硬盘缓存本页总览上下文硬盘缓存
DeepSeek API 上下文硬盘缓存技术对所有用户默认开启,用户无需修改代码即可享用。
用户的每一个请求都会触发硬盘缓存的构建。若后续请求与之前的请求在前缀上存在重复,则重复部分只需要从缓存中拉取,计入“缓存命中”。
注意:两个请求间,只有重复的前缀部分才能触发“缓存命中”,详间下面的例子。

例一:长文本问答
第一次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请总结一下这份财报的关键信息。"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位资深的财报分析师..."} {"role": "user",
"content": "<财报内容>\n\n 请分析一下这份财报的盈利情况。"}]
在上例中,两次请求都有相同的前缀,即 system 消息 + user 消息中的 <财报内容>。在第二次请求时,这部分前缀会计入“缓存命中”。

例二:多轮对话
第一次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}]
第二次请求
messages: [ {"role": "system", "content": "你是一位乐于助人的助手"}, {"role": "user",
"content": "中国的首都是哪里?"}, {"role": "assistant", "content": "中国的首都是北京。"},
{"role": "user", "content": "美国的首都是哪里?"}]
在上例中,第二次请求可以复用第一次请求开头的 system 消息和 user 消息,这部分会计入“缓存命中”。

Example 3: Few-shot learning
In practice, you can improve model output with few-shot learning: supplying a handful of examples in the request so the model picks up a specific pattern. Because few-shot prompts usually share the same context prefix, the hard disk cache makes few-shot usage significantly cheaper.
First request
messages: [
  {"role": "system", "content": "You are an expert historian. The user will ask a series of questions. Answer concisely, starting each reply with `Answer:`"},
  {"role": "user", "content": "In which year did Qin Shi Huang unify the six states?"},
  {"role": "assistant", "content": "Answer: 221 BC"},
  {"role": "user", "content": "Who founded the Han dynasty?"},
  {"role": "assistant", "content": "Answer: Liu Bang"},
  {"role": "user", "content": "Who was the last emperor of the Tang dynasty?"},
  {"role": "assistant", "content": "Answer: Li Zhu"},
  {"role": "user", "content": "Who was the founding emperor of the Ming dynasty?"},
  {"role": "assistant", "content": "Answer: Zhu Yuanzhang"},
  {"role": "user", "content": "Who was the founding emperor of the Qing dynasty?"}
]
Second request
messages: [
  {"role": "system", "content": "You are an expert historian. The user will ask a series of questions. Answer concisely, starting each reply with `Answer:`"},
  {"role": "user", "content": "In which year did Qin Shi Huang unify the six states?"},
  {"role": "assistant", "content": "Answer: 221 BC"},
  {"role": "user", "content": "Who founded the Han dynasty?"},
  {"role": "assistant", "content": "Answer: Liu Bang"},
  {"role": "user", "content": "Who was the last emperor of the Tang dynasty?"},
  {"role": "assistant", "content": "Answer: Li Zhu"},
  {"role": "user", "content": "Who was the founding emperor of the Ming dynasty?"},
  {"role": "assistant", "content": "Answer: Zhu Yuanzhang"},
  {"role": "user", "content": "When did the Shang dynasty fall?"}
]
This example uses 4-shot prompting. The two requests differ only in the final question, so the second request can reuse the first four rounds of dialogue from the first request; that portion counts as a cache hit.

Checking cache hit statistics
The DeepSeek API response includes two additional fields in the usage object that report the request's cache performance:

prompt_cache_hit_tokens: the number of input tokens in this request that were served from the cache (¥0.1 per million tokens)

prompt_cache_miss_tokens: the number of input tokens in this request that were not found in the cache (¥1 per million tokens)
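Given those two fields and the prices quoted above, the input cost of a request is a simple weighted sum. A hedged sketch follows; the `usage` dict is a made-up example payload, not a real API response, and `input_cost_cny` is a hypothetical helper:

```python
# Sketch: estimate the input cost of a request from the usage fields
# returned by the API, using the prices quoted in the docs above
# (CNY 0.1 per million cache-hit tokens, CNY 1 per million cache-miss tokens).

HIT_PRICE_PER_M = 0.1   # CNY per million cache-hit tokens
MISS_PRICE_PER_M = 1.0  # CNY per million cache-miss tokens

def input_cost_cny(usage):
    """Weighted input cost in CNY, given a usage dict with cache fields."""
    hit = usage["prompt_cache_hit_tokens"]
    miss = usage["prompt_cache_miss_tokens"]
    return hit / 1_000_000 * HIT_PRICE_PER_M + miss / 1_000_000 * MISS_PRICE_PER_M

# Made-up example: 90% of a 1M-token input hits the cache.
usage = {"prompt_cache_hit_tokens": 900_000, "prompt_cache_miss_tokens": 100_000}
print(round(input_cost_cny(usage), 2))  # 0.19  (vs 1.00 with no cache at all)
```

A 90% hit rate on a large prompt cuts the input cost by more than 5x, which is why few-shot and long-document workloads benefit most.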

Hard disk caching and output randomness
The hard disk cache only matches the prefix of the user's input. The output is still produced by inference and is still affected by parameters such as temperature, which introduces randomness; the output is the same quality as it would be without the cache.
Other notes

The cache stores data in 64-token units; content shorter than 64 tokens will not be cached.

The cache system is best-effort and does not guarantee a 100% hit rate.

Cache construction takes on the order of seconds. A cache that is no longer used is cleared automatically, typically after a few hours to a few days.
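The 64-token storage unit means the cacheable part of a shared prefix rounds down to a multiple of 64. A tiny sketch of that rule (my reading of the note above, not an official formula):

```python
# Sketch: only full 64-token storage units can be cached, so the maximum
# cacheable portion of an n-token shared prefix rounds down to a
# multiple of 64. This models the note above; it is not an official API.

UNIT = 64

def max_cacheable_tokens(n_prefix_tokens):
    """Largest multiple of UNIT not exceeding the shared prefix length."""
    return (n_prefix_tokens // UNIT) * UNIT

print(max_cacheable_tokens(63))   # 0   -> shorter than one unit, never cached
print(max_cacheable_tokens(130))  # 128 -> two full units; the 2 leftover tokens miss
```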

Copyright © 2024 DeepSeek, Inc.