Can't pickle generics in certain cases #9390

Open
1 task done
strangemonad opened this issue May 4, 2024 · 3 comments

Comments

@strangemonad

Initial Checks

  • I confirm that I'm using Pydantic V2

Description

The reified generic classes only seem to be registered as a value in their modules in some cases. The test case below illustrates the bug.

This may be a dupe of the root cause of #7503.

The core issue seems to be that pickle (at least in Python 3.11) first tries to find a symbol for the class to pickle using its __qualname__:
https://github.com/python/cpython/blob/978fba58aef347de4a1376e525df2dacc7b2fff3/Lib/pickle.py#L1062. This seems to have been the case since at least 3.8.
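To illustrate the lookup without involving Pydantic at all, here is a minimal sketch using only the standard library. The module name `demo_mod`, the class name `Box[str]`, and the helper functions are made up for the demo; the class is created dynamically so that its `__qualname__` is not bound in its module until it is registered by hand.

```python
import pickle
import sys
import types

# Hypothetical stand-in module, so the demo does not depend on how the script is run.
demo_mod = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = demo_mod


def make_dynamic_class(name):
    """Create a class at runtime; its __qualname__ becomes `name`."""
    cls = type(name, (), {})
    cls.__module__ = "demo_mod"
    return cls


def can_pickle(obj):
    """Return True if pickle.dumps succeeds for obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


Box = make_dynamic_class("Box[str]")

# Nothing named "Box[str]" exists in demo_mod yet, so the lookup fails.
before = can_pickle(Box())

# Bind the class under its __qualname__ in the module it claims to live in.
setattr(demo_mod, Box.__qualname__, Box)
after = can_pickle(Box())

# Unpickling resolves the same module attribute back to the class.
roundtrip = type(pickle.loads(pickle.dumps(Box()))) is Box

print(before, after, roundtrip)
```

The only difference between the failing and the working call is whether the module has an attribute whose name equals the class's `__qualname__`.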

But Pydantic only seems to register a symbol for the reified generic classes when they're created at the global level:

if called_globally: # create global reference and therefore allow pickling

I'm not close enough to the details of the original implementation by @dmontagu in 53fcbec, but is there a reason to restrict the updating of module references to when things are run at the top level? What would be the downside of always updating the module named by origin.__module__ so that it holds the reified generic classes?

Some off the cuff thoughts around options:

  • Eliminating the type var args in the generic subclass __qualname__ will probably have a bunch of other, unintended side effects. It also means that un-pickling would use the raw generic base class.
  • Whichever way the specialized generic subclass is registered, some portion of the un-pickling code path needs to specify that same generic type and args. E.g. if you're starting a new process and my.module.MyGeneric[T] exists on the import path but you haven't ever defined a MyGeneric[str], un-pickling will fail with an error similar to how pickling currently fails in the test case below.
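The un-pickling half of that second point can be sketched with the standard library alone. This is an illustration under assumptions, not Pydantic's behaviour: `fresh_mod` and `Wrapper[int]` are invented names, and deleting the module attribute is used here to imitate a new process in which the specialization was never created.

```python
import pickle
import sys
import types

# Hypothetical module standing in for "my.module".
fresh_mod = types.ModuleType("fresh_mod")
sys.modules["fresh_mod"] = fresh_mod

# A runtime-created class registered under its __qualname__, so pickling works.
Wrapper = type("Wrapper[int]", (), {})
Wrapper.__module__ = "fresh_mod"
setattr(fresh_mod, Wrapper.__qualname__, Wrapper)

payload = pickle.dumps(Wrapper())

# Imitate a new process where the specialization was never defined:
# the module is importable, but the "Wrapper[int]" attribute is missing.
delattr(fresh_mod, Wrapper.__qualname__)

try:
    pickle.loads(payload)
    load_failed = False
except (AttributeError, pickle.UnpicklingError):
    load_failed = True

# Re-creating the registration (as re-specializing the generic would) fixes it.
setattr(fresh_mod, Wrapper.__qualname__, Wrapper)
restored = pickle.loads(payload)

print(load_failed, type(restored) is Wrapper)
```

So any fix on the pickling side still leaves the loading process responsible for materializing the same specialization before it calls `pickle.loads`.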

Example Code

import pickle
from typing import Generic, TypeVar

from pydantic import BaseModel

T = TypeVar("T")


class MyGeneric(BaseModel, Generic[T]):
    prop: T


def create_and_pickle():
    m = MyGeneric[str](prop="test")
    print(m.__class__.__qualname__)
    print(pickle.dumps(m))


# If you uncomment the next line, it registers the name "MyGeneric[str]"
# (the class's __qualname__) in this module, which makes this test case work.
# MyGeneric[str]

create_and_pickle()

Python, Pydantic & OS Version

pydantic version: 2.6.4
        pydantic-core version: 2.16.3
          pydantic-core build: profile=release pgo=true
                 install path: /Users/shawn/Code/instance-bio/instance/services/web/.venv/lib/python3.11/site-packages/pydantic
               python version: 3.11.4 (main, Aug 14 2023, 09:41:08) [Clang 14.0.3 (clang-1403.0.22.14.1)]
                     platform: macOS-14.4.1-arm64-arm-64bit
             related packages: typing_extensions-4.11.0 pyright-1.1.321 pydantic-extra-types-2.7.0 fastapi-0.110.2 pydantic-settings-2.2.1
                       commit: unknown
@strangemonad strangemonad added bug V2 Bug related to Pydantic V2 pending Awaiting a response / confirmation labels May 4, 2024
@sydney-runkle
Member

Hi @strangemonad,

Thanks for your questions. I think this is a duplicate of #8913. Let me know if that explanation is helpful enough for your use case!

@strangemonad
Author

@sydney-runkle it does seem like a dupe of #8913. I suppose I'd question the rationale for closing the issue though? Manually managing dynamic models is always discussed as an "advanced pydantic topic" that should only be needed in a few cases while generics are documented as a feature that should "just work". At the very least this should probably be a change to the docs or a better error message (e.g. by overriding the reduce method of generic pydantic models to make it clear it's not supported) but I don't think it's unreasonable to support since generics are supported as first-class.
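As a rough idea of what the "better error message" option could look like, here is a sketch in plain Python. It is not Pydantic code: `ExplainUnpicklable`, `err_mod` and `Parametrized[int]` are hypothetical names, and the check simply mirrors pickle's own lookup (walk the dotted `__qualname__` from the class's module) before delegating to the default `__reduce_ex__`.

```python
import pickle
import sys
import types


class ExplainUnpicklable:
    """Hypothetical mixin that fails early with a descriptive message."""

    def __reduce_ex__(self, protocol):
        cls = type(self)
        # Resolve the class the way pickle would: by dotted __qualname__
        # starting from the module named in __module__.
        found = sys.modules.get(cls.__module__)
        for part in cls.__qualname__.split("."):
            found = getattr(found, part, None)
        if found is not cls:
            raise pickle.PicklingError(
                f"{cls.__qualname__!r} is created at runtime and is not registered "
                f"in module {cls.__module__!r}, so it cannot be pickled"
            )
        return super().__reduce_ex__(protocol)


# Hypothetical module and runtime-created class for the demo.
err_mod = types.ModuleType("err_mod")
sys.modules["err_mod"] = err_mod
Parametrized = type("Parametrized[int]", (ExplainUnpicklable,), {})
Parametrized.__module__ = "err_mod"

try:
    pickle.dumps(Parametrized())
    message = None
except pickle.PicklingError as exc:
    message = str(exc)

# Once the class is registered under its __qualname__, pickling goes through.
setattr(err_mod, Parametrized.__qualname__, Parametrized)
data = pickle.dumps(Parametrized())

print(message)
```

Whether something like this belongs in the library or only in the docs is a separate question; the sketch only shows that the unsupported case can be detected before pickle emits its generic lookup error.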

I'm currently working around this with the following mixin, which I include in the class hierarchy. Since there's already a case in the ModelMetaClass that handles this code path when module == __main__, I don't think it should be too difficult to extend.

import sys
from typing import Generic

from pydantic import BaseModel
from typing_extensions import override


class PatchGenericPickle:
    """A mixin that allows generic pydantic models to be serialized and deserialized with pickle.

    Notes
    -----
    In general, pickle shouldn't be encouraged as a means of serialization since there are better,
    safer options. In some cases, e.g. Streamlit's `@st.cache_data`, there's no getting around
    needing to use pickle.

    As of Pydantic 2.7, generics don't properly work with pickle. The core issue is the following:
    1. For each specialized generic, pydantic creates a new subclass at runtime. This class
       has a `__qualname__` that contains the type var argument e.g. `"MyGeneric[str]"` for a
       `class MyGeneric(BaseModel, Generic[T])`.
    2. Pickle attempts to find a symbol with the value of `__qualname__` in the module where the
       class was defined, which fails since Pydantic defines that class dynamically at runtime.
       Pydantic does attempt to register these dynamic classes but currently only for classes
       defined at the top-level of the interpreter.

    See Also
    --------
    - https://github.com/pydantic/pydantic/issues/9390
    """

    @classmethod
    @override
    def __init_subclass__(cls, **kwargs):
        # Note: we're still in __init_subclass__, not yet in __pydantic_init_subclass__
        #  not all model_fields are available at this point.
        super().__init_subclass__(**kwargs)

        if not issubclass(cls, BaseModel):
            raise TypeError(
                "PatchGenericPickle can only be used with subclasses of pydantic.BaseModel"
            )
        if not issubclass(cls, Generic):
            raise TypeError("PatchGenericPickle can only be used with Generic models")

        qualname = cls.__qualname__
        declaring_module = sys.modules[cls.__module__]
        if qualname not in declaring_module.__dict__:
            # This should work in all cases, but we might need to make this check and update more
            # involved e.g. see pydantic._internal._generics.create_generic_submodel
            declaring_module.__dict__[qualname] = cls

@sydney-runkle
Member

@strangemonad,

Fair point! I'll tag this with the documentation tag - we can definitely start with that as a first step.

@sydney-runkle sydney-runkle added documentation and removed bug V2 Bug related to Pydantic V2 pending Awaiting a response / confirmation labels Jun 3, 2024