The documentation is missing some details: https://docs.lambdalabs.com/public-cloud/lambda-chat-api/
When I run a prompt through it I get this back:
{
  "id": "chat-dea5c8eddcfa4ad08d488f2501f1b3b4",
  "object": "chat.completion",
  "created": 1730717593,
  "model": "hermes3-405b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas. Due to the COVID-19 pandemic, the entire series was held at this neutral site to reduce travel and potential exposure to the virus."
      },
      "finish_reason": "stop",
      "content_filter_results": {
        "hate": {
          "filtered": false
        },
        "self_harm": {
          "filtered": false
        },
        "sexual": {
          "filtered": false
        },
        "violence": {
          "filtered": false
        },
        "jailbreak": {
          "filtered": false,
          "detected": false
        },
        "profanity": {
          "filtered": false,
          "detected": false
        }
      }
    }
  ],
  "usage": {
    "prompt_tokens": 65,
    "completion_tokens": 45,
    "total_tokens": 110,
    "prompt_tokens_details": null,
    "completion_tokens_details": null
  },
  "system_fingerprint": ""
}
Those content_filter_results look interesting - especially if I can turn those options on or off (I'd like to experiment with the jailbreak one for example) - but they aren't mentioned in the documentation at the moment.
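In the meantime you can at least inspect them programmatically. Here's a minimal sketch in plain Python (the field names are taken straight from the response above; whether every model returns the same set of filters is an assumption) that reports which filters fired, if any:

```python
# Minimal sketch: scan a chat completion response dict for any
# content_filter_results entries where a filter fired.
# Field names ("choices", "content_filter_results", "filtered",
# "detected") are copied from the sample response above.

def triggered_filters(response):
    """Return the names of any filters that were triggered in any choice."""
    hits = []
    for choice in response.get("choices", []):
        results = choice.get("content_filter_results", {})
        for name, flags in results.items():
            # "filtered" appears on every filter; "detected" only on some
            # (jailbreak, profanity) in the sample response.
            if flags.get("filtered") or flags.get("detected"):
                hits.append(name)
    return hits

# Trimmed-down version of the response shown above.
sample = {
    "choices": [
        {
            "index": 0,
            "content_filter_results": {
                "hate": {"filtered": False},
                "jailbreak": {"filtered": False, "detected": False},
                "profanity": {"filtered": False, "detected": False},
            },
        }
    ]
}

print(triggered_filters(sample))  # prints []
```

Nothing fired in my example, so this returns an empty list; a response where `"detected"` flipped to `true` on the jailbreak filter would show up as `["jailbreak"]`.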
It's a fine-tuned model over Llama 3.1, which is the kind of thing I -- a non-expert -- would want to do, and have thought of doing, if I were training an LLM for a specific programming language for private (non-cloud-hosted) use. So this is interesting, and yet I lack the knowledge to really understand its impact in the LLM world.
> the model displays significant improvements in judgment and reward modeling.
This seems significant?
Technical report: https://nousresearch.com/wp-content/uploads/2024/08/Hermes-3...