Python crashes while building a flow that includes Hugging Face Embeddings sentence_transformers #1869

Open
AlastairJarvis opened this issue May 9, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@AlastairJarvis

Describe the bug
While building a flow using Hugging Face Embeddings component (model: sentence-transformers/all-mpnet-base-v2), Python crashes.

Browser and Version

  • Browser : Chrome 124.0.6367.119, Mac
  • Version 1.0.0a26

To Reproduce
Steps to reproduce the behavior:

  1. Go to My Collection
  2. Select Vector Store RAG Flow added from Store
  3. Replace OpenAI Embeddings with Hugging Face Embeddings
  4. Build
  5. See error in Langflow interface
  Error Building Component
       Error building vertex Hugging Face Embeddings: Could not import sentence_transformers python package. Please install it with `pip install sentence-transformers`.
  6. Install sentence-transformers
  7. Attempt to build again.
  8. See error in Terminal window

/usr/local/Cellar/python@3.10/3.10.14/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
objc[8054]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[8054]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
[05/09/24 15:24:22] ERROR 2024-05-09 15:24:22 - ERROR - glogging - Worker (pid:8054) was sent SIGABRT! (glogging.py:272)


dosubot bot commented May 9, 2024

Hey there, @AlastairJarvis! 👋 I'm Dosu, a friendly bot here to lend a hand with bugs, answer your queries, or guide you on contributing while we await a human maintainer. Just wanted you to know I'm diving into your issue and will get back to you with a comprehensive response soon! Stay tuned.

dosubot bot added the bug label May 9, 2024
@AlastairJarvis
Author

Let me know if you can't reproduce and I can send through the Problem Report for Python from the OS.

I get this 100%, and happy to help debug if you're unable to repro it there.


dosubot bot commented May 9, 2024

To address the crash issue you're encountering with Python when integrating Hugging Face Embeddings into your Vector Store RAG Flow, consider the following targeted steps:

  1. Ensure Initialization Before Forking: Load all Hugging Face models before any multiprocessing occurs to avoid semaphore and objc fork safety issues.

  2. Adjust Langflow Component Configuration: In the HuggingFaceEmbeddingsComponent, set multi_process to False to prevent multiprocessing complications.

  3. Set Environment Variable on macOS: Use OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES to bypass objc fork safety checks, though proceed with caution as it may affect application stability.

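  4. Change Multiprocessing Context: Switch Python's multiprocessing start method from "fork" to "spawn" or "forkserver" to mitigate issues related to process forking. (Python has no "thread" start method; the valid values are "fork", "spawn", and "forkserver".)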

  5. Update Libraries: Keep Hugging Face Transformers, Sentence Transformers, and Langflow libraries up to date to benefit from the latest fixes and improvements.

  6. Enhance Debugging in Langflow: Implement detailed logging around the initialization and operation of the HuggingFaceEmbeddingsComponent to identify the crash source more effectively.
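Step 3 can be sketched in Python as follows. The variable generally has to be in the environment before the process that forks is started, so this sketch builds a modified environment and hands it to a child process. The helper names `fork_safety_env` and `run_with_fork_safety_disabled` are made up for illustration, and the child command here is only a probe that echoes the variable; whether the workaround actually stops the crash reported in this issue has not been confirmed in the thread.

```python
import os
import subprocess
import sys


def fork_safety_env(base=None):
    """Return a copy of the environment with the macOS objc fork-safety check disabled."""
    env = dict(os.environ if base is None else base)
    env["OBJC_DISABLE_INITIALIZE_FORK_SAFETY"] = "YES"
    return env


def run_with_fork_safety_disabled(command):
    """Run `command` (a list of arguments) in a child process that inherits the variable."""
    return subprocess.run(
        command, env=fork_safety_env(), capture_output=True, text=True
    )


if __name__ == "__main__":
    # Stand-in child process that only echoes the variable. In practice the
    # command would be whatever starts the server (assumed here to be
    # something like ["langflow", "run"]).
    probe = [
        sys.executable,
        "-c",
        "import os; print(os.environ['OBJC_DISABLE_INITIALIZE_FORK_SAFETY'])",
    ]
    print(run_with_fork_safety_disabled(probe).stdout.strip())
```

The same effect can be had from a shell by exporting the variable before launching the server; the Python form is only convenient when the launch is already scripted.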

Here's an example of how to configure the HuggingFaceEmbeddingsComponent to avoid multiprocessing issues:

hugging_face_component = HuggingFaceEmbeddingsComponent(
    model_name="sentence-transformers/all-mpnet-base-v2",
    multi_process=False  # Avoids multiprocessing issues
)
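For step 4, a minimal sketch of using a non-fork start method with the standard library looks like this. It is generic `multiprocessing` code, not Langflow code: `square`, `spawn_context`, and `run_in_spawned_pool` are illustrative names. It only affects processes created through this context, so it may not change how the server's own workers are forked, which is where the SIGABRT in the log above was reported.

```python
import multiprocessing as mp


def square(x):
    """Trivial worker function; must be defined at module level so it can be pickled."""
    return x * x


def spawn_context():
    """Return a context whose child processes are started with 'spawn' instead of 'fork'."""
    return mp.get_context("spawn")


def run_in_spawned_pool(values):
    """Map `square` over `values` in a small pool of spawned worker processes."""
    with spawn_context().Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The __main__ guard is required with 'spawn', because each child
    # re-imports this module when it starts.
    print(run_in_spawned_pool([1, 2, 3]))
```

Using `get_context` rather than `set_start_method` keeps the choice local to this code and avoids changing the start method for other libraries in the same process.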

Additionally, the introduction of the CTransformersComponent in Langflow, with its config parameter for detailed configuration, including multiprocessing settings, suggests enhanced support for Hugging Face model integration. This component is designed to improve the handling of language models by providing customizable deployment and execution options.

If these steps do not resolve the issue, please share any additional error messages or information for further assistance.

About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.


@ogabrielluiz
Contributor

Hey @AlastairJarvis

Do you know if this component accesses the HuggingFace API or if it loads the model locally?

Running it locally will not work on the 1.0 alpha because it will load the model on every run.

@AlastairJarvis
Author

Aha.

When I first tried this component (HuggingFaceEmbeddings), LangFlow prompted me to download and install it, which leads me to believe it's trying to run it locally.

I see HuggingFaceAPI Embeddings is a different component, but it looks like this is also pointing to localhost - so pointing to an API being served locally?
