Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for missing property on union type causes failure in subsequent property checks #58448

Open
molisani opened this issue May 6, 2024 · 17 comments
Labels
Help Wanted You can do this Possible Improvement The current behavior isn't wrong, but it's possible to see that it might be better in some cases
Milestone

Comments

@molisani
Copy link
Contributor

molisani commented May 6, 2024

🔎 Search Terms

"missing property", "property check", "type union", "control flow analysis"

🕗 Version & Regression Information

  • This is the behavior in every version I tried*, and I reviewed the FAQ for entries about property checks

*There was a minor change between 4.8 and 4.9 that changed the types but did not change the behavior: inference improved from never to incorrectUnionElement & Record<"key", unknown>.

⏯ Playground Link

https://www.typescriptlang.org/play/?ts=5.4.5#code/FASwdgLgpgTgZgQwMZQARIPYFsAOBGVAb2FVNRigQBMMwAbAT1QQC5UwBXLAI1gG4SZCtVqNU3Npx79gAX2ChIsRCnTYcAJiKDSwmvSat2XXjAHzgEBjjSZcqALw7UAHzW48zt3c0K4HMCQIEFpUGgB9AGdsKAgAC3AAcwAKAEptMjUwSIh3HEciWWZIvIFnEDhUZIAiBGrUcDz04kzMnwA6BHaIDABlCBgktIFWsgB6MdQAPQB+ZwtMiqrq7nrGn2bnNvV27m6+gaHUkdHUCem5zIWyJZqkNbAmjNOOpH3+wbAU463xydn5gpFpUanUGo8Ns9Rh0uj0PkcTqNzgCrkCbiCVg8ni0Xjs9nDDl9hr9SMjLmR5LIgA

💻 Code

interface comp1 {
    readonly a: number;
    readonly b: number;
}

interface comp2 {
    readonly a: number;
}

type comp =
    | comp1
    | comp2

function do_something() {
    const comp = {} as comp;

    if ("a" in comp) {
        comp.a.toString();
        // ^? comp
    }

    if ("b" in comp) {
        comp.b.toString();
        // ^? comp1
    }

    if ("c" in comp) {
        comp.c.toString();
        // ^? comp & Record<"c", unknown>
    }

    if ("a" in comp) {
        comp.a.toString();
        // ^? comp2
    }

    if ("b" in comp) {
        comp.b.toString();
        // ^? comp2 & Record<"b", unknown>
    }
}

🙁 Actual behavior

Every property check after the ("c" in comp) one fails to narrow correctly if the key is one that is only defined for some of the element types of the type union. Attempting to access the property after one of these checks results in unknown. As well, the narrowed variable in subsequent conditionals is an arbitrary(?) union element.

🙂 Expected behavior

The conditionals that come after the ("c" in comp) conditional should have the same behavior as the ones before it, as they are the exact same code.

Additional information about the issue

This was initially noticed in a more complicated context with assertions, that I can include here as a secondary example. This one is even more strange, as it is a check for a property that is only defined on some of the elements of the union and it blocks its own duplicate check later.

@RyanCavanaugh RyanCavanaugh added Help Wanted You can do this Possible Improvement The current behavior isn't wrong, but it's possible to see that it might be better in some cases labels May 6, 2024
@RyanCavanaugh
Copy link
Member

This is a consequence of how control flow causes subtype reduction. It might be fixable but the comp type is always going to have problems of this nature due to the possible reduction.

@molisani
Copy link
Contributor Author

molisani commented May 6, 2024

So there are certain types of control flow analyses that can alter the type beyond the end of a (non-returning) block? Is that necessary for certain checks/behaviors or just a side effect of the current implementation?

We have a workaround for this issue, so it isn't critical at the moment that this is fixed upstream. That being said I'd still be interested in a solution.

@RyanCavanaugh
Copy link
Member

The basic pattern here is whenever control flow has two different ways to get to the same place, it's going to union the upstream types, and that unioning is subject to reduction. So in the most basic example

function do_something() {
    const comp = {} as comp;
    comp;
    // ^?
    if ("c" in comp) {
    }
    // --> here <--
    comp;
    // ^?
}

At here, there are two ways to get there - one via the body of the if (which could have effects, e.g. a conditional early return), and one not. So there's a union of the incoming types, comp & Record<"c", unknown>, and comp1 | comp2. This subtype-reduces to just comp2

Without reduction, you'd see weird artifacts, e.g.

function foo(x: Animal) {
  if (isCat(x)) { /* no-op */ }
  // --> here <--
}

In this example, at here, the "correct" type is just back to Animal, not Cat | Animal (a somewhat meaningless union type the same way comp1 | comp2 is)

@RyanCavanaugh RyanCavanaugh added this to the Backlog milestone May 6, 2024
@molisani
Copy link
Contributor Author

molisani commented May 8, 2024

So there's a union of the incoming types, comp & Record<"c", unknown>, and comp1 | comp2. This subtype-reduces to just comp2

Thanks for elucidating the behavior of the incoming types, but I'm still not quite clear why comp & Record<"c", unknown> and comp1 | comp2 subtype-reduces to comp2. I can understand that comp1 and comp2 separately would subtype-reduce to comp2, but how does comp & Record<"c", unknown> change that?

@RyanCavanaugh
Copy link
Member

Subtype reduction doesn't treat (A | B) | C different from A | B | C

@molisani
Copy link
Contributor Author

molisani commented May 8, 2024

So the subtype reduction of comp & Record<"c", unknown> and comp is really the subtype reduction of comp1, comp2, comp1 & Record<"c", unknown>, and comp2 & Record<"c", unknown>?

That feels wrong somehow, but I see now what you mean that these kinds of types will have issues due to this reduction.

@molisani
Copy link
Contributor Author

molisani commented May 8, 2024

https://github.com/microsoft/TypeScript/blob/bc14459f3b6af5a9547631a590fdfc0fdf35d042/src/compiler/checker.ts#L17191-L17201

            const namedUnions: Type[] = [];
            addNamedUnions(namedUnions, types);
            const reducedTypes: Type[] = [];
            for (const t of typeSet) {
                if (!some(namedUnions, union => containsType((union as UnionType).types, t))) {
                    reducedTypes.push(t);
                }
            }
            if (!aliasSymbol && namedUnions.length === 1 && reducedTypes.length === 0) {
                return namedUnions[0];
            }

Looking at checker.ts, inside getUnionTypeWorker there's a call to addNamedUnions. This ends up being (part of) the problem for this case as it passes in comp & Record<"c", unknown> and comp and both of those types are added as named unions. If the intersection wasn't considered a named union then getUnionTypeWorker would terminate early with just comp.

https://github.com/microsoft/TypeScript/blob/bc14459f3b6af5a9547631a590fdfc0fdf35d042/src/compiler/checker.ts#L17105-L17117

    function addNamedUnions(namedUnions: Type[], types: readonly Type[]) {
        for (const t of types) {
            if (t.flags & TypeFlags.Union) {
                const origin = (t as UnionType).origin;
                if (t.aliasSymbol || origin && !(origin.flags & TypeFlags.Union)) {
                                  // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                    pushIfUnique(namedUnions, t);
                }
                else if (origin && origin.flags & TypeFlags.Union) {
                    addNamedUnions(namedUnions, (origin as UnionType).types);
                }
            }
        }
    }

addNamedUnions considers comp & Record<"c", unknown> to be a named union because even though it doesn't have an alias symbol it is a union1 and it's origin type2 is not a union. I wonder what the origin of this conditional is, because if we remove it (and just check for alias symbol) this issue is resolved3. Do you think this is a worthwhile avenue to explore for a potential solution?

Footnotes

  1. It is the resolvedReducedType of that intersection, which is (comp1 & Record<"c", unknown>) | (comp2 & Record<"c", unknown>)

  2. comp & Record<"c", unknown>

  3. It fixes this issue, but the immediate side effect is that many named unions end up as their expanded form in some contexts (keyof T[] becomes "length" | "toString" | "toLocaleString" | "pop" | "push" | ...)

@fatcerberus
Copy link

fatcerberus commented May 9, 2024

I just want to chime in to note that subtype reduction isn't only a mitigation for design constraints (as Ryan's comment might have implied); it's also the type theoretically correct thing to do:

#53425 (comment)

Given A = { a: number } and AB = { a: number, b: number }, A | AB doesn't actually give you any more usable information than just A by itself, at least not without introducing unsoundness, because { a: number, b: string } is assignable to A and therefore also to A | AB.

Union and intersection types have very interesting type theoretical implications.

@molisani
Copy link
Contributor Author

molisani commented May 9, 2024

If it's the type theoretically correct thing to do, then why doesn't it subtype reduce after the if ("a" in comp) { block? There seems to be some special-casing for "named unions" that keeps the expected behavior in some cases and introduces this side effect in others.

@fatcerberus
Copy link

As @ahejlsberg said, in a perfect world subtype reduction would always happen. However as a matter of practicality (and as I noted in the other issue, pragmatism), it's only done in certain cases. The behavior you describe is likely explained by #53425 (comment).

@fatcerberus
Copy link

Also relevant: #53425 (comment)
Basically, any case where reduction doesn't happen is, effectively, a performance optimization.

@molisani
Copy link
Contributor Author

molisani commented May 9, 2024

However as a matter of practicality (and as I noted in the other issue, pragmatism), it's only done in certain cases

subtype reduction is supposed to occur only in cases where "foreign" types (meaning types that aren't part of the declared type) are injected by narrowing

This explains my first example, as Record<"c", unknown> is a "foreign" type, but it doesn't explain my second example where there are no foreign types being introduced.

subtype reduction can happen any time, but since it is costly (worst case quadratic) we labor hard to avoid it when possible. The "foreign" type detection in CFA is an example of that.

Hypothetical: so wouldn't it be a reasonable performance benefit to skip subtype reduction if we're at the end of a conditional block that did not return?


I understand that there is a high-level "type theory" aspect to this issue, but clearly a line is drawn at some arbitrary point for performance/usability reasons. This subtype reduction doesn't always happen, and as you (@fatcerberus) pointed out here, when it does/doesn't can be observable. In this case, the observability directly results in an unexpected and confusing type error.

I'm hopeful that since this was tagged with Possible Improvement and not Working as Intended that line can possibly be shifted in the future to handle this edge case.

@RyanCavanaugh
Copy link
Member

Hypothetical: so wouldn't it be a reasonable performance benefit to skip subtype reduction if we're at the end of a conditional block that did not return?

Yep! See e.g. #58013

@molisani
Copy link
Contributor Author

molisani commented May 9, 2024

Thanks! Looks like there's some renewed interest in this area recently. As I mentioned earlier, this isn't a blocking issue for us, so for now thanks for all of the useful info.

@ahejlsberg
Copy link
Member

There seems to be some special-casing for "named unions" that keeps the expected behavior in some cases and introduces this side effect in others.

The "named unions" logic is a red herring. It only has to do with computing the origin type of a union, a de-normalized representation we keep around solely for type display purposes.

This explains my first example, as Record<"c", unknown> is a "foreign" type, but it doesn't explain my second example where there are no foreign types being introduced.

In your second example, C is a foreign type because it isn't part of the original declared type of value.

Hypothetical: so wouldn't it be a reasonable performance benefit to skip subtype reduction if we're at the end of a conditional block that did not return?

I'm not sure what you mean by that. Subtype reduction potentially happens when two or more control flows join following a conditional construct. For example, at the bottom of an if statement, the control flows from the true and false branches join and we union together the narrowed types at that point--and, if one or both of the narrowed types from the branches contain "foreign" types, we perform subtype reduction. If one of the branches exits (e.g. because of a return statement), then there's nothing to join at the bottom of the if statement and only the narrowed type from the other branch survives.

@molisani
Copy link
Contributor Author

molisani commented May 9, 2024

My misunderstanding here was about what "foreign" meant in regards to the original type. I was thinking that foreign was determined from the type immediately before the if statement, not the absolute origin.

@fatcerberus
Copy link

I understand that there is a high-level "type theory" aspect to this issue, but clearly a line is drawn at some arbitrary point for performance/usability reasons.

Yeah, I’m a pragmatist myself so I fully understand. I bring up the type theory implications not to be patronizing but only because people often (understandably!) assume that "meow" in someAnimal guarantees they have a cat, when in fact they may just have a freak-of-nature dog that meows, and the type system can do very little to prevent that from happening. That’s not always an issue in practice, but it can be, so it’s worth keeping in mind IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Help Wanted You can do this Possible Improvement The current behavior isn't wrong, but it's possible to see that it might be better in some cases
Projects
None yet
Development

No branches or pull requests

4 participants