
OpenAI stream response missing the first token


OpenAI's streaming response has changed in a way that affects how each chunk must be handled and parsed: a chunk can no longer be assumed to contain only complete events. Parsing code needs to be more robust to cope with this.

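// Setup for the v3 openai Node SDK, which provides createChatCompletion.
// Assumes the API key is available in the OPENAI_API_KEY environment variable.
const { Configuration, OpenAIApi } = require('openai');
const openai = new OpenAIApi(new Configuration({ apiKey: process.env.OPENAI_API_KEY }));

// `template` is the prompt string; run this inside an async function.
const completion = await openai.createChatCompletion({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: template }],
  stream: true,
  temperature: 0.7
}, { responseType: 'stream' });
const stream = completion.data;
stream.on('data', (chunk) => {
  // Each chunk may carry several events separated by a blank line.
  const payloads = chunk.toString().split("\n\n");
  for (const payload of payloads) {
    if (payload.includes('[DONE]')) return;
    console.log(payload);
  }
});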
Here, payloads is the array produced by splitting one chunk of the OpenAI stream on blank lines. Previously, the chunks looked like this:

chunk-1

[
`data: {"id":"chatcmpl-8Jh3daCazSSQaONS2m3x8ThuZ7NXw","object":"chat.completion.chunk","created":1699704437,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"Hell"},"finish_reason":null}]}`,
''
]

chunk-2


[
'data: {"id":"chatcmpl-8Jh3daCazSSQaONS2m3x8ThuZ7NXw","object":"chat.completion.chunk","created":1699704437,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"o world"},"finish_reason":null}]}',
''
]

But after the change on OpenAI's side, the strings in payloads can be spliced: an event is cut off at the end of one chunk and continues at the start of the next.

chunk-1

[
`data: {"id":"chatcmpl-8Jh3daCazSSQaONS2m3x8ThuZ7NXw","object":"chat.completion.chunk","created":1699704437,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"Hell"},"finish_reason":null}]}`,
'data: {"id":"chatcmpl-8Jh3daCazSSQaONS2m3x8ThuZ7NXw","o'
]

chunk-2

[
'bject":"chat.completion.chunk","created":1699704437,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"o world"},"finish_reason":null}]}',
''
]

So parsing code written for complete payloads, for example one that calls JSON.parse on each payload, may fail on a spliced JSON string.
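To make the failure concrete, here is a minimal sketch that tries to parse each payload from the spliced chunks on its own. The helper name parsePayload is hypothetical, and the payloads are abbreviated versions of the ones shown above.

```javascript
// Hypothetical helper: strip the "data: " prefix and parse the rest as JSON.
// Returns null when the payload is not a complete JSON document.
function parsePayload(payload) {
  const json = payload.replace(/^data: /, '');
  try {
    return JSON.parse(json);
  } catch (error) {
    return null;
  }
}

// Abbreviated versions of the spliced payloads shown above.
const complete = 'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hell"},"finish_reason":null}]}';
const head = 'data: {"id":"chatcmpl-8Jh3daCazSSQaONS2m3x8ThuZ7NXw","o';
const tail = 'bject":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"o world"},"finish_reason":null}]}';

console.log(parsePayload(complete).choices[0].delta.content); // Hell
console.log(parsePayload(head)); // null
console.log(parsePayload(tail)); // null
```

Neither half of the spliced event is valid JSON by itself, so a per-payload JSON.parse silently loses the "o world" token.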

The point is that we need only the content and can ignore the other fields.

const regex = /"choices":.*?"delta":\{.*?"content":"(?<newToken>.*?)"/s;
stream.on('data', (chunk) => {
  const payloads = chunk.toString().split("\n\n");
  for (const payload of payloads) {
    if (payload.includes('[DONE]')) return;
    const matchPattern = regex.exec(payload);
    if (matchPattern && matchPattern.groups.newToken) {
      try {
        // The captured text is the new token; log it and forward it.
        const token = matchPattern.groups.newToken;
        console.log(token);
        // `res` is assumed to be the HTTP response being streamed to the client.
        res.write(token);
      } catch (error) {
        console.log(`Error writing token from ${payload}.\n${error}`);
      }
    }
  }
});

const regex = /"choices":.*?"delta":\{.*?"content":"(?<newToken>.*?)"/s;

This regular expression extracts the token from a payload string. It looks for "choices":, lazily skips any characters up to "delta":{, and then skips again up to "content":". Whatever lies between "content":" and the next occurrence of " is captured in a named group called newToken. The s flag lets . match newlines as well.
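As a quick check, the sketch below runs the same regex against the two halves of the spliced event from the example chunks. The helper name extractToken is hypothetical and exists only for this illustration.

```javascript
const regex = /"choices":.*?"delta":\{.*?"content":"(?<newToken>.*?)"/s;

// Hypothetical helper: returns the captured token, or null when nothing matches.
function extractToken(payload) {
  const match = regex.exec(payload);
  return match ? match.groups.newToken : null;
}

// The two halves of the spliced event from the example chunks.
const head = 'data: {"id":"chatcmpl-8Jh3daCazSSQaONS2m3x8ThuZ7NXw","o';
const tail = 'bject":"chat.completion.chunk","created":1699704437,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"o world"},"finish_reason":null}]}';

console.log(extractToken(head)); // null
console.log(extractToken(tail)); // o world
```

The first half contains no "choices" field, so it is skipped; the second half still contains the whole "choices" to "content" portion, so the token is recovered even though the payload is not valid JSON. Note that the captured text is the raw JSON string body, so an escape such as \n in the model output arrives as a backslash followed by n rather than as a newline.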

As noted above, we need only the content; the regex extracts it and ignores all other fields.
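The regex recovers a token only when the whole "choices" to "content" portion lands in a single payload. If the splice can fall inside that portion, one alternative (not the approach described in this post) is to hold back the unfinished tail of each chunk and parse only complete events. The sketch below assumes that events are separated by a blank line, as in the chunks shown above; createEventBuffer and tokensFrom are hypothetical names, not part of the OpenAI SDK.

```javascript
// Hypothetical helper: accumulates raw stream text and returns only the
// events that have been received in full.
function createEventBuffer() {
  let pending = ''; // text of an event that has not been terminated yet

  return function push(chunkText) {
    pending += chunkText;
    const parts = pending.split('\n\n');
    // The last element is either '' (the chunk ended on an event boundary)
    // or an incomplete event; keep it for the next call.
    pending = parts.pop();
    return parts.filter((part) => part.trim() !== '');
  };
}

// Hypothetical helper: extracts content tokens from complete events,
// skipping the [DONE] marker and events without content.
function tokensFrom(events) {
  const tokens = [];
  for (const event of events) {
    const data = event.replace(/^data: /, '');
    if (data === '[DONE]') continue;
    const content = JSON.parse(data).choices?.[0]?.delta?.content;
    if (content) tokens.push(content);
  }
  return tokens;
}

// Two chunks, spliced in the middle of the second event's content value.
const push = createEventBuffer();
const first = tokensFrom(push(
  'data: {"choices":[{"index":0,"delta":{"content":"Hell"}}]}\n\n' +
  'data: {"choices":[{"index":0,"delta":{"content":"o wo'
));
const second = tokensFrom(push(
  'rld"},"finish_reason":null}]}\n\ndata: [DONE]\n\n'
));
console.log(first, second);
```

In a stream handler, push(chunk.toString()) would replace the direct split on "\n\n". The trade-off is a small amount of state per stream in exchange for always parsing whole JSON documents.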

View original on Medium ↗

