Add labeled data and fix broken links #2015

Lalith-Sagar-Devagudi
Contributor

Description

Added labeled text and image data snippets to the notebook. The labeled data can be used to implement or test models such as classifiers.

Text (labeled):

A JSON file of dicts, each with two keys, "x" and "y". The data is derived from the IMDB dataset: "x" holds the review text and "y" holds the label, either 0 (negative review, 487 entries) or 1 (positive review, 513 entries). The text_labeled.json file contains 1000 dicts (reviews) in total.
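
A minimal sketch of how such a file could be loaded and checked, assuming text_labeled.json is a top-level JSON array of {"x": ..., "y": ...} dicts (the description does not spell out the top-level structure). The helper names `load_labeled_records` and `label_counts` are hypothetical and not part of the notebook.

```python
import json
from collections import Counter
from pathlib import Path


def load_labeled_records(path):
    """Load a JSON list of {"x": ..., "y": ...} dicts and validate each record.

    Assumption: the file is a top-level JSON array; adjust if the real
    file is laid out differently.
    """
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    for i, record in enumerate(records):
        if set(record) != {"x", "y"}:
            raise ValueError(f"record {i} must have exactly the keys 'x' and 'y'")
        if record["y"] not in (0, 1):
            raise ValueError(f"record {i} has an unexpected label: {record['y']!r}")
    return records


def label_counts(records):
    """Count how many records carry each label (0 = negative, 1 = positive)."""
    return Counter(record["y"] for record in records)
```

If the file matches the description above, `label_counts(load_labeled_records("text_labeled.json"))` should report 487 records for label 0 and 513 for label 1.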

Image (labeled):

A zipped folder containing 1000 images of cats (494) and dogs (506), along with a JSON file.
The JSON file consists of dicts, each with two keys, "x" and "y". The data is derived from the TensorFlow cats_vs_dogs dataset: "x" holds the image file path and "y" holds the label, either 0 (cat) or 1 (dog). The images_labeled.json file contains 1000 dicts (images) in total.
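
As a rough illustration of consuming the image labels, the sketch below pairs each image path with a readable class name and lists any files that are missing after extraction. It assumes the "x" paths are relative to the extracted folder, which the description does not state; `resolve_image_records` and `missing_images` are hypothetical helper names.

```python
import json
from pathlib import Path

# Label mapping as stated in the description: 0 is cat, 1 is dog.
LABEL_NAMES = {0: "cat", 1: "dog"}


def resolve_image_records(json_path, image_root):
    """Return (image_path, class_name) pairs for every record in the JSON file.

    Assumption: each "x" value is a path relative to image_root, the
    folder the zip archive was extracted into.
    """
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))
    root = Path(image_root)
    return [(root / record["x"], LABEL_NAMES[record["y"]]) for record in records]


def missing_images(resolved):
    """List the paths from resolve_image_records that do not exist on disk."""
    return [path for path, _ in resolved if not path.is_file()]
```

Running `missing_images` right after extraction is a cheap sanity check that the JSON file and the image folder agree before any model is trained on the data.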

Related Issues

[TEST-USE] Transfer learning #1967

Checklist

  • Is this code covered by new or existing unit tests or integration tests?
  • Did you run make unit-testing and make integration-testing successfully?
  • Do new classes, functions, methods and parameters all have docstrings?
  • Were existing docstrings updated, if necessary?
  • Was external documentation updated, if necessary?

Collaborator

@blythed blythed left a comment

See comment on other PR.

@Lalith-Sagar-Devagudi Lalith-Sagar-Devagudi force-pushed the reusable_snippets/update-get-data branch 2 times, most recently from be7f166 to 35f88a0 on May 10, 2024 06:20
Collaborator

@jieguangzhou jieguangzhou left a comment

I will fix the RAG use case; it won't affect me on my end.

@blythed blythed force-pushed the reusable_snippets/update-get-data branch from 35f88a0 to 7837f30 on May 14, 2024 07:57
Collaborator

blythed commented May 16, 2024

@Lalith-Sagar-Devagudi please resolve the conflicts and then we can merge.

@blythed blythed merged commit 2cc844d into SuperDuperDB:main May 17, 2024
Projects
Status: Done
4 participants