21 Jul 07:35

f8a5c0c

v1.13.0

🔆 Highlights

✨ Suggestions

You can now add suggestions to your Feedback datasets. Suggestions are machine-generated responses that appear pre-filled in the labeling form, so labelers work faster: they only need to correct the responses they disagree with.

All question types in the Feedback task support suggestions, but you can only add one suggestion per question.

Learn more about this feature in our docs.
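
As a rough illustration of the one-suggestion-per-question rule, here is a minimal standalone sketch. It does not use the Argilla client: the helper name check_one_suggestion_per_question and the dictionary keys ("question_name", "value") are assumptions made for this example only.

```python
from typing import Any, Dict, List


def check_one_suggestion_per_question(suggestions: List[Dict[str, Any]]) -> None:
    """Raise ValueError if two suggestions target the same question.

    Hypothetical helper: the "question_name" key is an assumed payload shape.
    """
    seen = set()
    for suggestion in suggestions:
        name = suggestion["question_name"]
        if name in seen:
            raise ValueError(f"more than one suggestion for question {name!r}")
        seen.add(name)


# Placeholder suggestions for a record with two questions.
suggestions = [
    {"question_name": "relevant", "value": "YES"},
    {"question_name": "comment", "value": "Looks correct."},
]
check_one_suggestion_per_question(suggestions)  # passes silently
```

Running a check like this before uploading records would surface duplicate suggestions early, instead of relying on the server to reject them.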

🗄️ List workspaces

We've added the ability to list all the workspaces that a user has access to. From the Python client, you can list the workspaces of the current user with rg.Workspace.list(); in the UI, the list appears on the user settings page.

🏋️‍♂️ Extended training support

We are extending our support for preparing data from Feedback datasets for training. This release includes strategies to unify responses to RankingQuestions and a task mapping for text classification, TrainingTaskMapping.for_text_classification.

Read more about how to use these methods to train models with Feedback collected in Argilla here.
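
To give an intuition for what unifying ranking responses means, the sketch below averages each option's rank across annotators and orders options by that mean. This is an illustration under assumptions, not Argilla's implementation of RankingQuestionStrategy: the function name, the input shape (one dict of option to rank per annotator) and the alphabetical tie-breaking are all choices made for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def unify_rankings_by_mean(responses: List[Dict[str, int]]) -> List[str]:
    """Order options by their average rank across responses (1 = best)."""
    ranks = defaultdict(list)
    for response in responses:
        for option, rank in response.items():
            ranks[option].append(rank)
    # Sort by mean rank, then by option name so ties are deterministic.
    return sorted(ranks, key=lambda option: (mean(ranks[option]), option))


# Placeholder responses from three annotators.
responses = [
    {"reply-1": 1, "reply-2": 2, "reply-3": 3},
    {"reply-1": 2, "reply-2": 1, "reply-3": 3},
    {"reply-1": 1, "reply-2": 3, "reply-3": 2},
]
print(unify_rankings_by_mean(responses))  # ['reply-1', 'reply-2', 'reply-3']
```

Other strategies (for example, taking the most frequent rank per option) would follow the same pattern with a different aggregation function.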

Changelog 1.13.0

Added

Added GET /api/v1/users/{user_id}/workspaces endpoint to list the workspaces to which a user belongs (#3308 and #3343).
Added HuggingFaceDatasetMixin for internal usage, to detach the FeedbackDataset integrations from the class itself, and use Mixins instead (#3326).
Added GET /api/v1/records/{record_id}/suggestions API endpoint to get the list of suggestions for the responses associated to a record (#3304).
Added POST /api/v1/records/{record_id}/suggestions API endpoint to create a suggestion for a response associated to a record (#3304).
Added support for RankingQuestionStrategy, RankingQuestionUnification and the .for_text_classification method for the TrainingTaskMapping (#3364)
Added PUT /api/v1/records/{record_id}/suggestions API endpoint to create or update a suggestion for a response associated to a record (#3304 and #3391).
Added suggestions attribute to FeedbackRecord, and allow adding and retrieving suggestions from the Python client (#3370)
Added allowed_for_roles Python decorator to check whether the current user has the required role to access the decorated function/method for User and Workspace (#3383)
Added API and Python Client support for workspace deletion (Closes #3260)
Added GET /api/v1/me/workspaces endpoint to list the workspaces of the current active user (#3390)
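
The two workspace-listing endpoints above could be called along these lines. This sketch only builds the request and does not send it. The helper name, the base URL, the API key value and the X-Argilla-Api-Key header name are assumptions; check them against your deployment.

```python
from typing import Optional
from urllib.request import Request


def build_workspaces_request(
    base_url: str, api_key: str, user_id: Optional[str] = None
) -> Request:
    """Build (but do not send) a GET request for a workspace-listing endpoint."""
    if user_id is None:
        path = "/api/v1/me/workspaces"  # workspaces of the current user
    else:
        path = f"/api/v1/users/{user_id}/workspaces"  # workspaces of a given user
    return Request(
        url=base_url.rstrip("/") + path,
        headers={"X-Argilla-Api-Key": api_key},  # assumed header name
        method="GET",
    )


# Placeholder server address and key.
request = build_workspaces_request("http://localhost:6900", "my-api-key")
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the call against a running server.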

Changed

Updated output payload for GET /api/v1/datasets/{dataset_id}/records, GET /api/v1/me/datasets/{dataset_id}/records, POST /api/v1/me/datasets/{dataset_id}/records/search endpoints to include the suggestions of the records based on the value of the include query parameter (#3304).
Updated POST /api/v1/datasets/{dataset_id}/records input payload to add suggestions (#3304).
The POST /api/datasets/:dataset-id/:task/bulk endpoints no longer create the dataset if it does not exist (Closes #3244)
Added Telemetry support for ArgillaTrainer (closes #3325)
User.workspaces is no longer an attribute but a property, and is calling list_user_workspaces to list all the workspace names for a given user ID (#3334)
Renamed FeedbackDatasetConfig to DatasetConfig and export/import from YAML as default instead of JSON (just used internally on push_to_huggingface and from_huggingface methods of FeedbackDataset) (#3326).
Protected metadata fields now support non-textual values; existing datasets must be reindexed. See docs for more detail (Closes #3332).
Updated Dockerfile parent image from python:3.9.16-slim to python:3.10.12-slim (#3425).
Updated quickstart.Dockerfile parent image from elasticsearch:8.5.3 to argilla/argilla-server:${ARGILLA_VERSION} (#3425).

Removed

Removed support for non-prefixed environment variables. All valid env vars start with ARGILLA_ (See #3392).
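
A quick way to spot settings that will now be ignored is to separate prefixed from non-prefixed variables. The sketch below is a standalone illustration; the helper name and the variable names in the sample environment are hypothetical.

```python
from typing import Dict, Mapping


def argilla_env_vars(environ: Mapping[str, str]) -> Dict[str, str]:
    """Return only the variables that carry the ARGILLA_ prefix."""
    return {
        name: value
        for name, value in environ.items()
        if name.startswith("ARGILLA_")
    }


# Placeholder environment; in practice you would pass os.environ.
environ = {
    "ARGILLA_EXAMPLE_SETTING": "kept",  # hypothetical variable name
    "EXAMPLE_SETTING": "ignored because it has no prefix",
}
print(argilla_env_vars(environ))  # {'ARGILLA_EXAMPLE_SETTING': 'kept'}
```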

Fixed

Fixed GET /api/v1/me/datasets/{dataset_id}/records endpoint always returning the responses for the records, even when responses was not requested via the include query parameter (#3304).
Values for protected metadata fields are no longer truncated (Closes #3331).
Large numeric IDs are now rendered properly in the UI (Closes #3265)
Fixed ArgillaDatasetCard to include the values/labels for all the existing questions (#3366)

Deprecated

Integer support for record id in text classification, token classification and text2text datasets.

As always, thanks to our amazing contributors

@manijhariya made their first contribution in #3295

Full Changelog: v1.12.1...1.13.0


12 Jul 13:06

frascuchon

v1.12.1

4cc3189

1.12.1

Fixed

Using rg.init with default argilla user skips setting the default workspace if not available. (Closes #3340)
Resolved wrong import structure ArgillaTrainer TrainingTaskMapping (Closes #3345)
Pin pydantic dependency to version < 2 (Closes #3348)


29 Jun 14:27

gabrielmbmb

v1.12.0

9c4fcb6

🔆 Highlights

New `RankingQuestion` in Feedback Task datasets

You can now include RankingQuestions in your Feedback datasets. These are designed to gather feedback on labelers' preferences by providing a set of options that labelers can order.
Here's how you can add a RankingQuestion to a FeedbackDataset:

import argilla as rg

dataset = rg.FeedbackDataset(
    # One field for the prompt and one per candidate reply.
    fields=[
        rg.TextField(name="prompt"),
        rg.TextField(name="reply-1", title="Reply 1"),
        rg.TextField(name="reply-2", title="Reply 2"),
        rg.TextField(name="reply-3", title="Reply 3"),
    ],
    questions=[
        rg.RankingQuestion(
            name="ranking",
            title="Order replies based on your preference",
            description="1 = best, 3 = worst. Ties are allowed.",
            required=True,
            # Maps each option value to its display text; a plain list
            # such as ["reply-1", "reply-2", "reply-3"] also works.
            values={"reply-1": "Reply 1", "reply-2": "Reply 2", "reply-3": "Reply 3"},
        )
    ],
)

More info in our docs.

Extended training support

You can now format responses from RatingQuestion, LabelQuestion and MultiLabelQuestion for your preferred training framework using the prepare_for_training method.

Also, we've added support for spacy-transformers in our Argilla Trainer.

Here's an example code snippet:

import argilla.feedback as rg

dataset = rg.FeedbackDataset.from_huggingface(
    repo_id="argilla/stackoverflow_feedback_demo"
)
task_mapping = rg.TrainingTaskMapping.for_text_classification(
    text=dataset.field_by_name("question"),
    label=dataset.question_by_name("tags")
)
trainer = rg.ArgillaTrainer(
    dataset=dataset,
    task_mapping=task_mapping,
    framework="spacy-transformers",
    fetch_records=False
)
trainer.update_config(num_train_epochs=2)
trainer.train(output_dir="my_awesome_model")

To learn more about how to use the Argilla Trainer, check our docs.