feat: add `PATCH /api/v1/records/{record_id}` endpoint #3920

gabrielmbmb · 2023-10-10T16:12:07Z

Description

This PR adds a new endpoint, `PATCH /api/v1/records/{record_id}`, that allows partially updating a Feedback Dataset record. A new method, `SearchEngine.update_record_metadata`, has been added so that the record metadata can also be updated in the `SearchEngine`.
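
As a rough illustration of how a client might call the new endpoint, the sketch below builds (but does not send) a `PATCH` request using only the Python standard library. The base URL, the record ID, the helper name `build_patch_record_request`, and the assumption that the request body carries a `metadata` object are illustrative and are not confirmed by this description; authentication is omitted.

```python
import json
import urllib.request
import uuid


def build_patch_record_request(
    base_url: str, record_id: uuid.UUID, metadata: dict
) -> urllib.request.Request:
    """Build a PATCH request that sends only the metadata of one record.

    Hypothetical helper: the body shape ({"metadata": ...}) is an assumption.
    """
    # A partial update sends only the attributes that should change.
    body = json.dumps({"metadata": metadata}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/records/{record_id}",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Placeholder values, for illustration only.
request = build_patch_record_request(
    base_url="http://localhost:6900",
    record_id=uuid.UUID("00000000-0000-0000-0000-000000000001"),
    metadata={"label": "a", "integer": 3},
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; the response format is not described here, so it is left out of the sketch.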

Type of change

New feature (non-breaking change which adds functionality)

How Has This Been Tested

Unit tests covering the additions have been added.

Checklist

I added relevant documentation
I followed the style guidelines of this project
I did a self-review of my code
I made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I filled out the contributor form (see text above)
I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)


github-actions · 2023-10-11T10:42:33Z

The URL of the deployed environment for this PR is https://argilla-quickstart-pr-3920-ki24f765kq-no.a.run.app


codecov · 2023-10-11T13:57:30Z

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

| Files | Coverage Δ | |
|---|---|---|
| src/argilla/server/apis/v1/handlers/records.py | `98.64% <100.00%> (+0.16%)` | ⬆️ |
| src/argilla/server/contexts/datasets.py | `98.71% <100.00%> (+0.37%)` | ⬆️ |
| src/argilla/server/models/database.py | `98.63% <100.00%> (ø)` | |
| src/argilla/server/policies.py | `97.56% <100.00%> (+0.04%)` | ⬆️ |
| src/argilla/server/schemas/v1/records.py | `100.00% <100.00%> (ø)` | |
| src/argilla/server/search_engine/commons.py | `91.15% <100.00%> (+0.31%)` | ⬆️ |
| src/argilla/server/schemas/v1/datasets.py | `99.03% <75.00%> (-0.32%)` | ⬇️ |
| src/argilla/server/search_engine/base.py | `81.56% <66.66%> (-0.33%)` | ⬇️ |

📢 Thoughts on this report? Let us know!

# Description

This PR adds the following:

- Updates `PATCH /api/v1/records/{record_id}`, added in #3920, so that it can also update the suggestions of a record. The suggestions in the input payload will **replace** the old suggestions.
- Adds a new `PATCH /api/v1/datasets/{dataset_id}/records` endpoint for batch/bulk updating the records of a dataset. The endpoint allows updating the same record attributes as the `PATCH /api/v1/records/{record_id}` endpoint.
- Slightly modifies the `SearchDocument` getter dict so that it does not try to populate the `SearchDocument.responses` attribute if the relationship has not been loaded (this avoids having to load `Record.responses` when updating the record document in the `SearchEngine` using `add_records`).
- Removes the `SearchEngine.update_record_metadata` method, as the same logic is covered by the `SearchEngine.add_records` method, which can also be used to update the fields of an existing document.
- Renames the `SearchEngine.add_records` method to `SearchEngine.index_records`, as it can be used to both add and update records.

**Type of change**

- [x] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

I made a small benchmark to test the latency of the new endpoint. I created a dataset with 100000 records and all the possible questions and metadata properties. Then I built batches of 1000 records, updating all the responses and metadata fields, and sent them to the API. The average response time of the bulk `PATCH` endpoint was approximately 0.8 seconds.

<details>
<summary>Code used for benchmark</summary>

```python
import uuid
import random

import argilla as rg

LABELS = ["a", "b", "c"]
RANKS = ["top-1", "top-2", "top-3"]

# Dataset with one question of each type and one metadata property of each type.
dataset = rg.FeedbackDataset(
    fields=[rg.TextField(name="text")],
    questions=[
        rg.TextQuestion(name="text"),
        rg.RatingQuestion(name="rating", values=[1, 2, 3, 4, 5]),
        rg.LabelQuestion(name="label", labels=LABELS),
        rg.MultiLabelQuestion(name="multi-label", labels=LABELS),
        rg.RankingQuestion(name="ranking", values=RANKS),
    ],
    metadata_properties=[
        rg.TermsMetadataProperty(name="label", values=LABELS),
        rg.IntegerMetadataProperty(name="integer", min=0, max=10),
        rg.FloatMetadataProperty(name="float", min=0, max=10),
    ],
)

dataset.add_records([rg.FeedbackRecord(fields={"text": "Hello"}, metadata={"extra": "yes"}) for _ in range(100000)])

remote = dataset.push_to_argilla(name=f"benchmark-{uuid.uuid4()}", workspace="gabriel")


def random_rank_order():
    # Shuffle a copy of the ranks by sorting on a random key.
    ranks = RANKS.copy()
    ranks.sort(key=lambda x: random.random())
    return [{"value": rank, "rank": i + 1} for i, rank in enumerate(ranks)]


def build_update_payload(record):
    # One item of the bulk PATCH body: new external ID, metadata and suggestions.
    return {
        "id": str(record.id),
        "external_id": str(uuid.uuid4()),
        "metadata": {
            "label": random.choice(["a", "b", "c"]),
            "integer": random.randint(0, 10),
            "float": random.uniform(0, 10),
        },
        "suggestions": [
            {"question_id": str(remote.questions[0].id), "value": "hello world" * random.randint(1, 15)},
            {"question_id": str(remote.questions[1].id), "value": random.randint(1, 5)},
            {"question_id": str(remote.questions[2].id), "value": random.choice(["a", "b", "c"])},
            {"question_id": str(remote.questions[3].id), "value": [random.choice(["a", "b", "c"])]},
            {"question_id": str(remote.questions[4].id), "value": random_rank_order()},
        ],
    }


http_client = rg.active_client().http_client

elapseds = []
batch = []
for record in remote.records:
    batch.append(build_update_payload(record))
    # Send a request every 1000 records and record how long it took.
    if len(batch) == 1000:
        response = http_client.httpx.patch(f"/api/v1/datasets/{remote.id}/records", json={"items": batch})
        elapseds.append(response.elapsed.total_seconds())
        batch = []

average_elapsed_time = sum(elapseds) / len(elapseds)
print("Average elapsed time", average_elapsed_time)
```

</details>

**Checklist**

- [ ] I added relevant documentation
- [x] I followed the style guidelines of this project
- [x] I did a self-review of my code
- [x] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above)
- [x] I have added relevant notes to the `CHANGELOG.md` file (See https://keepachangelog.com/)

---------

Co-authored-by: frascuchon <[email protected]>
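
The benchmark loop only sends a request when a batch reaches exactly 1000 items, which works there because 100000 is a multiple of 1000. For other dataset sizes, a client would probably also want to send the trailing partial batch. The sketch below shows one way to do that; `iter_batches` is a hypothetical helper, the record IDs are placeholders (real IDs would be UUIDs), and the `{"items": [...]}` body shape is taken from the benchmark code rather than from a formal schema.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items, including the last partial one."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Do not drop the remainder when the total is not a multiple of batch_size.
    if batch:
        yield batch


# Placeholder payloads standing in for the output of `build_update_payload`.
payloads = [{"id": str(i), "metadata": {"integer": i % 11}} for i in range(2500)]

# Each body would be sent as the JSON of one bulk PATCH request.
bodies = [{"items": batch} for batch in iter_batches(payloads, batch_size=1000)]
print([len(body["items"]) for body in bodies])  # [1000, 1000, 500]
```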

gabrielmbmb added 2 commits October 10, 2023 18:05

feat: add SearchEngine.update_record_metadata method

dad1693

feat: add PATCH /api/v1/records/{record_id} endpoint

79edc9c

gabrielmbmb added the type: enhancement and area: api labels Oct 10, 2023

gabrielmbmb added this to the v1.17.0 milestone Oct 10, 2023

gabrielmbmb requested review from jfcalvo and frascuchon October 10, 2023 16:12

gabrielmbmb self-assigned this Oct 10, 2023

gabrielmbmb changed the title ~~Feature/add patch record endpoint~~ feat: add PATCH /api/v1/records/{record_id} endpoint Oct 10, 2023

feat: add update_record_metadat unit tests

92e571c

jfcalvo reviewed Oct 11, 2023

src/argilla/server/contexts/datasets.py (resolved)

jfcalvo reviewed Oct 11, 2023

src/argilla/server/policies.py (outdated, resolved)

gabrielmbmb added 5 commits October 11, 2023 11:31

feat: add _build_metadata_payload_field function

ec2b287

fix: RecordPolicy.update condition

9d509bf

feat: add PATCH /api/v1/records/{record_id} unit tests

c96211f

Merge branch 'feature/support-for-metadata-filtering-and-sorting' int…

89c85b1

…o feature/add-patch-record-endpoint

fix: add_records test regression

d77b7de

gabrielmbmb marked this pull request as ready for review October 11, 2023 10:05

gabrielmbmb force-pushed the feature/add-patch-record-endpoint branch from bde500e to d77b7de Compare October 11, 2023 12:41

gabrielmbmb added 2 commits October 11, 2023 15:22

Merge branch 'feature/support-for-metadata-filtering-and-sorting' int…

d989c3f

…o feature/add-patch-record-endpoint

Merge branch 'feature/support-for-metadata-filtering-and-sorting' int…

4cf18ad

…o feature/add-patch-record-endpoint

gabrielmbmb merged commit 04893b5 into feature/support-for-metadata-filtering-and-sorting Oct 11, 2023

gabrielmbmb deleted the feature/add-patch-record-endpoint branch October 11, 2023 14:58

gabrielmbmb mentioned this pull request Oct 16, 2023

feat: add PATCH /api/v1/dataset/{dataset_id}/records endpoint #3934

Merged

9 tasks

frascuchon modified the milestones: v1.17.0, v1.18.0 Oct 19, 2023
