Perplexity

1. Overview

Perplexity AI is a conversational search engine that uses natural language processing to give users direct, accurate answers.

Available models:

  • sonar

  • sonar-pro

  • sonar-reasoning

  • sonar-reasoning-pro

  • llama-3.1-sonar-small-128k-online (Discontinued on 2025/02/22)

  • llama-3.1-sonar-large-128k-online (Discontinued on 2025/02/22)

  • llama-3.1-sonar-huge-128k-online (Discontinued on 2025/02/22)

Note

This API is compatible with the OpenAI interface format.
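Because of this compatibility, an OpenAI-style client can talk to the gateway by overriding its base URL. A minimal sketch using the official openai Python SDK (the API key is a placeholder):

```python
from openai import OpenAI

# Point an OpenAI-compatible client at the gateway.
# YOUR_API_KEY is a placeholder; use your actual key.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://gateway.theturbo.ai/v1",
)

response = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
```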

2. Request Description

  • Request method: POST

  • Request address: https://gateway.theturbo.ai/v1/chat/completions

3. Input Parameters

3.1 Header Parameters

| Parameter Name | Type | Required | Description | Example Value |
| --- | --- | --- | --- | --- |
| Content-Type | string | Yes | Sets the request body type; must be application/json. | application/json |
| Accept | string | Yes | Sets the response type; application/json is recommended. | application/json |
| Authorization | string | Yes | API key used for authentication. Format: Bearer $YOUR_API_KEY. | Bearer $YOUR_API_KEY |
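Assembled in Python, the required headers look like this minimal sketch (the key is a placeholder):

```python
# Required request headers; YOUR_API_KEY is a placeholder.
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": "Bearer YOUR_API_KEY",
}
```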

3.2 Body Parameters (application/json)

| Parameter Name | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| model | string | Yes | The model ID to use. See the available models listed in the Overview, such as sonar. | sonar |
| messages | array | Yes | Chat message list, compatible with the OpenAI interface format. Each object in the array contains role and content. | [{"role": "user","content": "hello"}] |
| role | string | No | Message role within a messages item. Values: system, user, assistant. | user |
| content | string | No | The text content of the message. | Hello, please tell me a joke. |
| temperature | number | No | Sampling temperature, between 0 and 2. Higher values make the output more random; lower values make it more focused and deterministic. | 0.7 |
| top_p | number | No | Nucleus sampling, between 0 and 1. Usually set as an alternative to temperature. | 0.9 |
| n | number | No | How many replies to generate for each input message. | 1 |
| stream | boolean | No | Whether to enable streaming output. When set to true, the response is returned as a stream of chunks in the OpenAI format. | false |
| max_tokens | number | No | The maximum number of tokens generated in a single reply, subject to the model's context length limit. | 1024 |
| presence_penalty | number | No | -2.0 to 2.0. Positive values encourage the model to introduce new topics; negative values make new topics less likely. | 0 |
| frequency_penalty | number | No | -2.0 to 2.0. Positive values reduce verbatim repetition; negative values make repetition more likely. | 0 |
| search_recency_filter | string | No | Restricts search results to the specified time range. Values: month, week, day, hour. | month |

4. Request Example
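A minimal non-streaming request sketch using Python's requests library; the API key and prompt are placeholders, and the optional parameters simply echo the example values from the table above:

```python
import requests

url = "https://gateway.theturbo.ai/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": "Bearer YOUR_API_KEY",  # placeholder key
}
payload = {
    "model": "sonar",
    "messages": [{"role": "user", "content": "Hello, please tell me a joke."}],
    "temperature": 0.7,
    "max_tokens": 1024,
    "stream": False,
}

resp = requests.post(url, headers=headers, json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()
print(data["choices"][0]["message"]["content"])
```

To stream instead, set "stream": true and consume the response incrementally as OpenAI-style chunks.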

5. Response Example
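Since the gateway follows the OpenAI response format, a non-streaming reply should resemble the following; all field values here are illustrative, not actual output:

```json
{
  "id": "chatcmpl-xxxxxxxx",
  "object": "chat.completion",
  "created": 1740000000,
  "model": "sonar",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Why don't scientists trust atoms? Because they make up everything!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 18,
    "total_tokens": 30
  }
}
```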
