feat: update FilteredRemoteFeedbackRecords.__len__ method #3916

gabrielmbmb
Member

@gabrielmbmb gabrielmbmb commented Oct 10, 2023

Description

This PR updates the __len__ method of FilteredRemoteFeedbackRecords so that it returns the actual number of records matching the filters. It does not rely on a dedicated endpoint; instead, it calls the existing list dataset records endpoint with offset=0 and limit=0.
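The idea can be sketched without a running server. The snippet below is a minimal illustration, not the Argilla implementation: it assumes the list-records response carries a total count of matching records alongside the (empty) page of items, and every name in it (ListRecordsResponse, FilteredRecords, fake_list_records, the total field) is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ListRecordsResponse:
    """Hypothetical response shape: one page of items plus the total match count."""
    items: List[Dict[str, Any]]
    total: int


# In-memory stand-in for the server-side records (illustrative data only).
_RECORDS = [{"metadata": {"letters": letter}} for letter in "abcabca"]


def fake_list_records(
    metadata_filters: Dict[str, List[str]], offset: int, limit: int
) -> ListRecordsResponse:
    """Simulates a list endpoint that reports the total even when limit=0."""
    matching = [
        record
        for record in _RECORDS
        if all(record["metadata"].get(name) in values
               for name, values in metadata_filters.items())
    ]
    return ListRecordsResponse(items=matching[offset:offset + limit], total=len(matching))


class FilteredRecords:
    """Illustrative filtered collection whose length comes from the list call."""

    def __init__(
        self,
        list_records: Callable[..., ListRecordsResponse],
        metadata_filters: Dict[str, List[str]],
    ) -> None:
        self._list_records = list_records
        self._metadata_filters = metadata_filters

    def __len__(self) -> int:
        # offset=0 and limit=0: no records are transferred, only the count is read.
        response = self._list_records(
            metadata_filters=self._metadata_filters, offset=0, limit=0
        )
        return response.total


filtered = FilteredRecords(fake_list_records, {"letters": ["a"]})
print(len(filtered))
```

Requesting an empty page keeps the call cheap, since the count is obtained without fetching any record payloads.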

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested

In a local development environment:

import random
import argilla as rg

dataset = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="text")
    ],
    questions=[
        rg.TextQuestion(name="question", required=True)
    ],
    metadata_properties=[
        rg.TermsMetadataProperty(name="letters", values=["a", "b", "c"])
    ]
)

records = [
    rg.FeedbackRecord(
        fields={"text": "Hello world!" * random.randint(5, 50)},
        responses=[{"values": {"question": {"value": "Hello world!" * random.randint(5, 50)}}}],
        metadata={"letters": random.choice(["a", "b", "c"])},
    )
    for _ in range(100000)
]

dataset.add_records(records)
remote_dataset = dataset.push_to_argilla(name="random-dataset-2", workspace="gabriel")
filtered_dataset = remote_dataset.filter_by(metadata_filters=[rg.TermsMetadataFilter(name="letters", values=["a"])])
len(filtered_dataset)  # roughly 33,000, i.e. about a third of the 100,000 records
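The script above can only check the length approximately, because the metadata values are drawn at random. One way to get an exact expected value is to generate the letters up front and count them locally before building the records. This is a minimal sketch under that assumption; the seed and the variable names are illustrative and not part of the PR.

```python
import random
from collections import Counter

random.seed(0)  # illustrative seed so the run is reproducible

# Draw the metadata values first so they can be counted locally.
letters = [random.choice(["a", "b", "c"]) for _ in range(100000)]
expected_counts = Counter(letters)

# Number of records a filter on letters == "a" should match.
expected_a = expected_counts["a"]
print(expected_a)
```

The same letters list would then be used for the metadata of each record, so that len(filtered_dataset) could be compared against expected_a directly.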

Checklist

  • I added relevant documentation
  • I followed the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I filled out the contributor form (see text above)
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@gabrielmbmb gabrielmbmb added the type: enhancement (Indicates new feature requests) and client labels Oct 10, 2023
@gabrielmbmb gabrielmbmb added this to the v1.17.0 milestone Oct 10, 2023
@gabrielmbmb gabrielmbmb self-assigned this Oct 10, 2023
@gabrielmbmb gabrielmbmb marked this pull request as ready for review October 10, 2023 12:54
@github-actions

The URL of the deployed environment for this PR is https://argilla-quickstart-pr-3916-ki24f765kq-no.a.run.app

@gabrielmbmb gabrielmbmb merged commit a3c7a7b into feature/support-for-metadata-filtering-and-sorting Oct 10, 2023
@gabrielmbmb gabrielmbmb deleted the feature/update-filtered-dataset-len-method branch October 10, 2023 13:58
@frascuchon frascuchon modified the milestones: v1.17.0, v1.18.0 Oct 19, 2023

3 participants