OpenAI Compatible API
NovelAI provides OpenAI-compatible API endpoints, allowing you to interact with NovelAI's text generation services using the standard OpenAI API format. This lowers the cost of migrating from OpenAI and improves interoperability with existing OpenAI-based tools.
Note
The OpenAI-compatible API uses different models than the native text generation API:
- OpenAI Compatible API: Uses GLM series models (glm-4-5, glm-4-6)
- Native Text Generation API: Uses NovelAI proprietary models (llama-3-erato-v1, kayra-v1, clio-v1)
GLM models are general-purpose chat models, while Erato/Kayra/Clio are models specifically optimized for story writing.
Available Models
Get the currently available models via listModels():
| Model | Description |
|---|---|
| glm-4-6 | GLM-4.6 (Recommended) |
| glm-4-5 | GLM-4.5 |
```typescript
const models = await client.openai.listModels();
// Returns: [{ id: 'glm-4-5', owned_by: 'novelai' }, { id: 'glm-4-6', owned_by: 'novelai' }]
```
Endpoint Overview
| Endpoint | Method | Full URL | Description |
|---|---|---|---|
| /oa/v1/completions | POST | https://text.novelai.net/oa/v1/completions | Text completion |
| /oa/v1/chat/completions | POST | https://text.novelai.net/oa/v1/chat/completions | Chat completion |
| /oa/v1/models | GET | https://text.novelai.net/oa/v1/models | List available models |
| /oa/v1/internal/token-count | POST | https://text.novelai.net/oa/v1/internal/token-count | Token count |
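Because these endpoints follow the standard OpenAI wire format, existing OpenAI tooling can in principle be pointed at them by overriding the base URL. Below is a minimal sketch using the official openai npm client; the base URL comes from the table above, but acceptance of a NovelAI API key as the client's Bearer token is an assumption here:
```typescript
import OpenAI from 'openai';

// Point the official OpenAI client at NovelAI's compatible base URL
const openai = new OpenAI({
  apiKey: 'your-api-key', // assumption: NovelAI key is accepted as a Bearer token
  baseURL: 'https://text.novelai.net/oa/v1',
});

const response = await openai.chat.completions.create({
  model: 'glm-4-6',
  messages: [{ role: 'user', content: 'Hello!' }],
  max_tokens: 100, // the official client uses snake_case field names on the wire
});

console.log(response.choices[0].message.content);
```
The SDK methods shown below wrap the same endpoints with camelCase parameters.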
Basic Usage
Text Completion
```typescript
import { NovelAI } from 'novelai-sdk-unofficial';
const client = new NovelAI({ apiKey: 'your-api-key' });
// First get available models
const models = await client.openai.listModels();
const model = models[0].id; // e.g. 'glm-4-6'
const response = await client.openai.completion({
  prompt: 'Once upon a time, in a kingdom far away,',
  model,
  maxTokens: 100,
  temperature: 0.7,
});
console.log(response.choices[0].text);
```
Chat Completion
```typescript
const response = await client.openai.chatCompletion({
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello! How are you?' },
  ],
  model: 'glm-4-6',
  maxTokens: 100,
});
console.log(response.choices[0].message.content);
```
Streaming Responses
Text Completion Streaming
```typescript
const stream = client.openai.completionStream({
  prompt: 'The quick brown fox',
  model: 'glm-4-6',
  maxTokens: 100,
});
for await (const chunk of stream) {
  const text = chunk.choices[0]?.text;
  if (text) process.stdout.write(text);
}
```
Chat Completion Streaming
```typescript
const stream = client.openai.chatCompletionStream({
  messages: [{ role: 'user', content: 'Tell me a story' }],
  model: 'glm-4-6',
  maxTokens: 200,
});
for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
```
Cancelling Streaming Requests
```typescript
const controller = new AbortController();
// Cancel after 5 seconds
setTimeout(() => controller.abort(), 5000);
try {
  for await (const chunk of client.openai.chatCompletionStream({
    messages: [{ role: 'user', content: 'Write a long story' }],
    model: 'glm-4-6',
  }, controller.signal)) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
  }
} catch (error) {
  if (error.name === 'AbortError') {
    console.log('\nGeneration cancelled');
  }
}
```
List Models
```typescript
const models = await client.openai.listModels();
for (const model of models) {
  console.log(`${model.id} (owned by ${model.owned_by})`);
}
```
Token Count
```typescript
const count = await client.openai.tokenCount({
  prompt: 'Hello, world! This is a test.',
  model: 'llama-3-erato-v1',
});
console.log(`Token count: ${count}`);
```
Parameters
Completion Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | string | required | Input prompt text |
| model | string | required | Model to use (e.g. glm-4-6) |
| maxTokens | number | - | Maximum tokens to generate (1-2048) |
| temperature | number | - | Sampling temperature (0-2) |
| topP | number | - | Nucleus sampling threshold (0-1) |
| topK | number | - | Top-K sampling |
| minP | number | - | Min-P sampling threshold (0-1) |
| stop | string \| string[] | - | Stop sequences |
| stream | boolean | false | Enable streaming response |
| n | number | - | Number of completions to generate |
| frequencyPenalty | number | - | Frequency penalty (-2.0 to 2.0) |
| presencePenalty | number | - | Presence penalty (-2.0 to 2.0) |
| seed | number | - | Random seed |
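A sketch combining several of these parameters in a single request; the values are purely illustrative, not tuned recommendations:
```typescript
const response = await client.openai.completion({
  prompt: 'The old lighthouse keeper watched the storm roll in and',
  model: 'glm-4-6',
  maxTokens: 150,
  temperature: 0.8,      // higher values give more varied output (0-2)
  topP: 0.95,            // nucleus sampling threshold (0-1)
  frequencyPenalty: 0.3, // discourage verbatim repetition (-2.0 to 2.0)
  stop: ['\n\n'],        // stop at the first blank line
  n: 2,                  // request two independent completions
  seed: 42,              // fix the seed for reproducible sampling
});

// Each completion is returned as its own choice
for (const choice of response.choices) {
  console.log(`[${choice.index}] ${choice.text}`);
}
```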
Chat Completion Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| messages | OAIChatMessage[] | required | Array of chat messages |
| model | string | required | Model to use (e.g. glm-4-6) |
| maxTokens | number | - | Maximum tokens to generate (1-2048) |
| temperature | number | - | Sampling temperature (0-2) |
| topP | number | - | Nucleus sampling threshold (0-1) |
| topK | number | - | Top-K sampling |
| minP | number | - | Min-P sampling threshold (0-1) |
| stop | string \| string[] | - | Stop sequences |
| stream | boolean | false | Enable streaming response |
| enableThinking | boolean | - | Enable thinking/reasoning mode |
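For example, a sketch that turns on the thinking/reasoning mode for a chat request; how the intermediate reasoning is surfaced in the response is not documented here, so only the final message is read:
```typescript
const response = await client.openai.chatCompletion({
  messages: [
    { role: 'system', content: 'You are a careful math tutor.' },
    { role: 'user', content: 'What is 17 * 24? Explain briefly.' },
  ],
  model: 'glm-4-6',
  maxTokens: 300,
  enableThinking: true, // let the model reason before answering
});

console.log(response.choices[0].message.content);
```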
Message Format
```typescript
interface OAIChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
  name?: string; // Optional author name
}
```
Unified Sampling Parameters
NovelAI supports additional unified sampling parameters:
| Parameter | Type | Description |
|---|---|---|
| unifiedLinear | number | Unified linear sampling parameter |
| unifiedQuadratic | number | Unified quadratic sampling parameter |
| unifiedCubic | number | Unified cubic sampling parameter |
| unifiedIncreaseLinearWithEntropy | number | Unified entropy-increase linear parameter |
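A sketch passing the unified sampling parameters on a completion request; the values are placeholders (suitable ranges are not documented here), and it is assumed they can be sent alongside the standard parameters:
```typescript
const response = await client.openai.completion({
  prompt: 'In the beginning,',
  model: 'glm-4-6',
  maxTokens: 100,
  // Unified sampling parameters (placeholder values)
  unifiedLinear: 0.6,
  unifiedQuadratic: 0.1,
  unifiedCubic: 0.0,
  unifiedIncreaseLinearWithEntropy: 0.2,
});

console.log(response.choices[0].text);
```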
Comparison with Native API
| Feature | OpenAI Compatible API | Native API |
|---|---|---|
| Interface Style | OpenAI standard format | NovelAI native format |
| Available Models | GLM series (glm-4-5, glm-4-6) | Erato/Kayra/Clio |
| Model Characteristics | General-purpose chat models | Story writing specialized models |
| Chat Support | ✅ Built-in | ❌ Manual construction |
| Migration Cost | Low | High |
| Parameter Naming | camelCase | snake_case |
| Streaming Format | SSE (data: JSON) | Native stream |
When to Use OpenAI Compatible API
- Migrating existing code from OpenAI
- Integrating with OpenAI-compatible tools
- Building general chat applications
- Need standardized API format
When to Use Native API
- Need NovelAI's proprietary story writing models (Erato/Kayra/Clio)
- Need finer parameter control
- Already have NovelAI native code
- Need best story generation quality
Response Format
Completion Response
```typescript
interface OAICompletionResponse {
  id: string;
  object: 'text_completion';
  created: number;
  model: string;
  choices: Array<{
    text: string;
    index: number;
    finish_reason: 'stop' | 'length' | null;
  }>;
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}
```
Chat Completion Response
```typescript
interface OAIChatCompletionResponse {
  id: string;
  object: 'chat.completion';
  created: number;
  model: string;
  choices: Array<{
    index: number;
    message: {
      role: 'assistant';
      content: string;
    };
    finish_reason: 'stop' | 'length' | null;
  }>;
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}
```
Error Handling
```typescript
import {
  NovelAI,
  AuthenticationError,
  InvalidRequestError,
  RateLimitError
} from 'novelai-sdk-unofficial';
try {
  const response = await client.openai.chatCompletion({
    messages: [{ role: 'user', content: 'Hello' }],
    model: 'glm-4-6',
  });
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.error('Invalid API Key');
  } else if (error instanceof InvalidRequestError) {
    console.error('Invalid request parameters:', error.message);
  } else if (error instanceof RateLimitError) {
    console.error('Rate limit exceeded');
  }
}
```
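If requests regularly hit the rate limit, the call can be wrapped in a simple retry with exponential backoff. A sketch; the retry count and delays are arbitrary, and whether RateLimitError carries a retry-after hint is not documented here:
```typescript
async function chatWithRetry(content: string, maxRetries = 3) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await client.openai.chatCompletion({
        messages: [{ role: 'user', content }],
        model: 'glm-4-6',
      });
    } catch (error) {
      // Only retry rate-limit errors, and only up to maxRetries attempts
      if (!(error instanceof RateLimitError) || attempt >= maxRetries) throw error;
      // Exponential backoff: 1s, 2s, 4s, ... (delays are arbitrary)
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** attempt));
    }
  }
}

const reply = await chatWithRetry('Hello!');
console.log(reply.choices[0].message.content);
```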