refactor: add `HuggingFaceDatasetMixIn` under `integrations` #3326
Merged
Available in Python 3.8 via `typing_extensions`
gabrielmbmb approved these changes Jul 4, 2023
Codecov Report
Patch coverage:
Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #3326    +/-   ##
===========================================
+ Coverage    90.27%   90.35%   +0.07%
===========================================
  Files          234      236       +2
  Lines        12642    12695      +53
===========================================
+ Hits         11413    11471      +58
+ Misses        1229     1224       -5
Co-authored-by: Gabriel Martin <[email protected]>
leiyre pushed a commit that referenced this pull request Jul 4, 2023
* develop:
  refactor: add `HuggingFaceDatasetMixIn` under `integrations` (#3326)
  feat: add list user workspaces endpoint (#3308)
  ci: Stop linking issues to team work project
  chore: add missing `greenlet` dependency in `server` extra (#3330)
  fix: unit test failing if not local db (#3307)
  ci: Optimize build + test pipeline (#3300)
  refactor: simplify old bulk endpoints to avoid create datasets if does not exists (#3306)
  📝 Update doc site link (#3299)
  🚑 Fix dependencies (#3302)
  feat: migrate to async SQLAlchemy engine (#3162)
leiyre pushed a commit that referenced this pull request Jul 5, 2023
* develop:
  fix: return all workspaces in system for owner users (#3343)
  fix: `rg.init` with argilla user using quickstart images raise an unexpected error (#3341)
  feat: add `Suggestion` endpoints (#3304)
  feat: add `list_user_workspaces` and `User.workspaces` property (#3334)
  refactor: add `HuggingFaceDatasetMixIn` under `integrations` (#3326)
  feat: add list user workspaces endpoint (#3308)
  ci: Stop linking issues to team work project
  chore: add missing `greenlet` dependency in `server` extra (#3330)
  docs: update developer docs (#3314)
  Docs/3312 docs 112 is not building correctly (#3313)
  fix: unit test failing if not local db (#3307)
  ci: Optimize build + test pipeline (#3300)
  refactor: simplify old bulk endpoints to avoid create datasets if does not exists (#3306)
gabrielmbmb added a commit that referenced this pull request Jul 24, 2023
…k.schemas` (#3427)

# Description

This PR starts off with the refactoring effort to make sure that everything's more maintainable and scalable.

So on, this PR refactors the `argilla/feedback/schemas.py` to be split in different files in a more organised way as `argilla/feedback/schemas/*.py` so that we have `fields.py`, `questions.py` and `records.py` to contain the main `pydantic.BaseModel`s for those. Also all the docstrings have been rewritten from scratch to be clearer and provide more information.

Additionally, this PR also adds the `ArgillaDatasetMixin` to detach the Argilla-related functionality from the `FeedbackDataset` itself, as we recently did for the HuggingFace Hub integration (i.e. #3326)

**Type of change**

- [X] Refactor (change restructuring the codebase without changing functionality)
- [X] Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And ideally, reference `tests`)

- [X] Add unit tests for every `pydantic.BaseModel` under `argilla/feedback/schemas`

**Checklist**

- [ ] I added relevant documentation
- [X] follows the style guidelines of this project
- [X] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [X] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above)
- [ ] I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

---------

Co-authored-by: Gabriel Martin <[email protected]>
Description

To avoid the increasing size of `argilla/client/feedback/dataset.py`, I've decided to detach the integrations from the `FeedbackDataset` class and create a MixIn class to contain all those methods specific to the integrations within the `FeedbackDataset`, in this case for 🤗 `Datasets`.

Besides that, I've also renamed the `FeedbackDatasetConfig` to `DatasetConfig`, and included some methods to dump a YAML file from now on, instead of a JSON file, since the YAML file is more readable. So now we upload `argilla.yaml` when pushing a `FeedbackDataset` to the HuggingFace Hub via `push_to_huggingface`.

Type of change

How Has This Been Tested

`DeprecationWarning`s

Checklist
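
For readers unfamiliar with the pattern, the description's two changes (moving integration methods into a MixIn, and dumping the config as YAML) can be illustrated with a minimal, standard-library-only sketch. This is not the code from the PR. The config fields (`field_names`, `question_names`, `guidelines`), the `to_yaml` method, and the `export_config` helper are hypothetical names chosen for illustration, and nothing here talks to the HuggingFace Hub. The emitter writes each value as a JSON literal, which is also valid YAML 1.2; a real implementation would more likely use a YAML library.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class DatasetConfig:
    """Hypothetical stand-in for the renamed config class (field names are assumptions)."""

    field_names: List[str] = field(default_factory=list)
    question_names: List[str] = field(default_factory=list)
    guidelines: str = ""

    def to_yaml(self) -> str:
        # Write one "key: value" line per attribute. Each value is a JSON
        # literal, which YAML 1.2 also accepts (flow sequences, quoted strings).
        return "".join(
            f"{key}: {json.dumps(value)}\n" for key, value in asdict(self).items()
        )


class HuggingFaceDatasetMixIn:
    """Groups integration-specific methods outside the main dataset class."""

    def export_config(self, directory: Path) -> Path:
        # Hypothetical helper: writes `argilla.yaml` to a local directory
        # instead of uploading it anywhere.
        path = Path(directory) / "argilla.yaml"
        path.write_text(self.config.to_yaml(), encoding="utf-8")
        return path


class FeedbackDataset(HuggingFaceDatasetMixIn):
    """The dataset class keeps its core state; integrations come from the MixIn."""

    def __init__(self, config: DatasetConfig) -> None:
        self.config = config


with tempfile.TemporaryDirectory() as tmp:
    dataset = FeedbackDataset(
        DatasetConfig(
            field_names=["text"],
            question_names=["label"],
            guidelines="Be kind.",
        )
    )
    written = dataset.export_config(Path(tmp))
    yaml_text = written.read_text(encoding="utf-8")

print(yaml_text)
```

The point of the design is that `FeedbackDataset` only inherits the integration methods, so further integrations can be added as additional MixIn classes without growing the dataset module itself.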