Changelog

New features, improvements, and fixes in Agenta.

25 May 2024 v0.14.14

New LLM Provider: Welcome Gemini!

We are excited to announce the addition of Google's Gemini to our list of supported LLM providers, bringing the total number to 12.

24 May 2024

Playground Improvements

We've improved the workflow for adding outputs to a dataset in the playground. Previously, you had to select the name of the test set each time. Now, the last used test set is selected by default.
We have significantly improved the debugging experience when creating applications from code. Now, if an application fails, you can view the logs to understand the reason behind the failure.
We moved the copy message button in the playground to the output text area.
We now hide the cost and usage in the playground when they aren't specified.
We've made improvements to error messages in the playground.

Bug Fixes

Fixed the order of the arguments when running a custom code evaluator
Fixed the timestamp in the Testset view (previous timestamps were dropping the trailing 0)
Fixed the creation of application from code in the self-hosted version when using Windows

1 May 2024 v0.14.0

Prompt and Configuration Registry

We've introduced a feature that allows you to use Agenta as a prompt registry or management system. In the deployment view, we now provide an endpoint to directly fetch the latest version of your prompt. Here is what it looks like:

from agenta import Agenta

agenta = Agenta()
# Fetches the configuration with caching
config = agenta.get_config(base_id="xxxxx", environment="production", cache_timeout=200)

You can find additional documentation here.
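To illustrate what a parameter like cache_timeout generally implies, here is a minimal sketch of time-based caching. It does not use the Agenta SDK and is not a description of how Agenta implements caching: the TTLCache class, the fetch function passed to it, and the assumption that the timeout is expressed in seconds are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical helper: reuses a fetched value until it is older than `timeout`."""

    def __init__(
        self,
        fetch: Callable[[Hashable], Any],
        timeout: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # function that performs the real (slow) lookup
        self._timeout = timeout  # assumed to be in seconds
        self._clock = clock      # injectable so the cache can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        # Serve the cached value while it is still fresh.
        if entry is not None and now - entry[0] < self._timeout:
            return entry[1]
        # Otherwise fetch again and remember when we did so.
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value
```

With a cache like this, repeated lookups for the same environment inside the timeout window reuse the stored value, so the application does not hit the network on every request.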

Improvements

Previously, publishing a variant from the playground to an environment was a manual process. From now on, we publish to the production environment by default.

28 April 2024 v0.13.8

Miscellaneous Improvements

The total cost of an evaluation is now displayed in the evaluation table. This allows you to understand how much evaluations are costing you and track your expenses.

Bug Fixes

Fixed sidebar focus in automatic evaluation results view
Fixed the incorrect URLs shown when running 'agenta variant serve'

23 April 2024

Evaluation Speed Increase and Numerous Quality of Life Improvements

We've improved the speed of evaluations by 3x through the use of asynchronous batching of calls.
We've added Groq as a new provider along with Llama3 to our playground.
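As a rough illustration of asynchronous batching in general, the sketch below sends inputs in fixed-size batches and awaits each batch concurrently. It is not Agenta's evaluation code: run_in_batches and fake_llm_call are hypothetical names, and the fake call only simulates latency with a short sleep.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_in_batches(
    inputs: Sequence[str],
    call: Callable[[str], Awaitable[str]],
    batch_size: int,
) -> List[str]:
    """Runs `call` on every input, at most `batch_size` calls at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[str] = []
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start:start + batch_size]
        # gather() runs the batch concurrently and returns results in input order.
        results.extend(await asyncio.gather(*(call(item) for item in batch)))
    return results


async def fake_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for a network call to an LLM provider."""
    await asyncio.sleep(0.01)
    return prompt.upper()


if __name__ == "__main__":
    outputs = asyncio.run(run_in_batches(["a", "b", "c"], fake_llm_call, batch_size=2))
    print(outputs)
```

Because the calls in a batch wait on the network at the same time, total wall-clock time is closer to the number of batches than to the number of inputs; the batch size caps concurrency so provider rate limits are less likely to be hit.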

Bug Fixes

Resolved a UI rendering bug in Testset view.
Fixed incorrect URLs displayed when running the 'agenta variant serve' command.
Corrected timestamps in the configuration.
Resolved errors when using the chat template with empty input.
Fixed latency format in evaluation view.
Added a spinner to the Human Evaluation results table.
Resolved an issue where the gitignore was being overwritten when running 'agenta init'.

14 April 2024 v0.13.0

Observability (beta)

You can now monitor your application usage in production. We've added a new observability feature (currently in beta), which allows you to:

Monitor cost, latency, and the number of calls to your applications in real-time.
View the logs of your LLM calls, including inputs, outputs, and the configurations used. You can also add any interesting logs to your test set.
Trace your more complex LLM applications to understand the logic within and debug it.

As of now, all newly created applications include observability by default. We are working towards a GA version in the coming weeks, which will be scalable and better integrated with your applications. We will also be adding tutorials and documentation about it.