Subtyping inversion with intersection types and nominal labels

You

I got stuck studying the subtyping system with intersection types. In particular, I want a property which holds for subtyping systems without intersection types. Basically, assume A[X → {X: C}] = B[X → {X: D}] (syntactic equality), where X is a nominal identity that does not appear free as a label (i.e. A and B cannot be {X: nat}) but can appear free as a variable in A and B (i.e. can be X, covariantly and contravariantly). In a system without intersection types, we either have the fact that X does not appear free in A and B, or we get an equivalence relation between C and D.

In a subtyping system with intersection types, however, the equivalence cannot be checked by syntactic equality due to the introduction of intersections. I define equivalence to be mutual subtyping between the two types. You can just assume the standard subtyping rules for intersection types extended with label types, where the bodies of the labels are compared.

For the intersection system, I’m finding it hard to have a generalized version of the above inversion property. There are several issues:

1. Path avoidance: With intersection types, it is possible that some components in a type do not actually become part of the comparison, e.g. A = (Top → Top) & (X → Top) and B = Top → Top. For A to be a subtype of B, we can just drop the right part of A such that the contravariant occurrence of X doesn’t appear in the comparison. For B to be a subtype of A, although the right part of type A is part of the comparison, the contravariant argument is subsumed by Top. From my point of view (and please correct me if I’m wrong), this example doesn’t break the property but does make the reasoning a bit harder. I would imagine that even more complex examples could be built that avoid certain paths. This path avoidance problem is more common in single-direction subtyping. I’m not sure if there are extra conditions we could exploit for mutual subtyping when defining equivalence; if so, maybe we can get something stronger out of it.

In dealing with the path avoidance problem, another issue is that the lemma as stated is now broken. This is because we could have non-compatible C and D substituted into the above two types while X also appears free in A, so the lemma would break.

Semantically, I think in the simpler system without intersection types, the second condition we get from the inversion is simply expressing that substituting X with any types into A and B doesn’t break their equivalence. The idea I am having is to have a semantic characterization of this property. In particular, I think the inversion principle I’m going to prove for this kind of substitution equivalence is that either C and D are equivalent, or semantically you could have arbitrary substitutions for A and B. This must hold in the labeled form so that the equivalence relation is not broken.

But still, it would be nice if we could have a syntactic characterization of the semantic specification for the second branch above. The idea I am having is to define a subtyping relation along with a variable set that explicitly tracks the variables that are avoided in the subtyping. Specifically, the subtyping would be A <: B avoiding S, where S is a set of variables. The variables in the set are not allowed to have a reflexive comparison; specifically, the reflexive rule for variables is not allowed to use the variables in the set. However, it is still fine for a type to have such a variable as a free variable if:

1. It is compared to be a subtype of Top, or
2. It is dropped in the intersection subtyping so that it doesn’t actively participate in the subtyping.

Alright, so the lemma I would imagine to prove would become either:

1. For the same mutual subtyping relation (which is going to be a conjunction of two subtypings between the two substituted types), we get C to be equivalent to D, defined in terms of the mutual subtyping equivalence.
2. We could add X to the avoidance set for A and B, and we get the mutual subtyping, but with X extended to the avoiding variable set.

Now this lemma becomes really hard to prove. If you do induction on the size of A and B, when A is an intersection, it becomes difficult. Say A = A1 & A2 and A[C] ~ B[D]. By intersection subtyping inversion, and assuming B is ordinary (not an intersection type), we only get

either A1[C] < B[D] or A2[C] < B[D], and
B[D] < A1[C] and B[D] < A2[C].

If we focus on A1[C] < B[D], we get A1[C] ~ B[D], and the IH (second branch) gives A1 ~ B avoiding X. But to prove A1 & A2 ~ B avoiding X we also need to show B < A2 avoiding X. I guess the best I can do is to drop the nominal labels in B[D] < A2[C], but that doesn’t show X is avoided in B < A2.

From an intuition point of view, I guess some other conditions have to be taken into account. For example:

1. A2 is also a subtype of B.
2. B is ordinary and is equivalent to A1 (though this might not be strictly necessary).

I am not sure if this stuck case suggests a counterexample to the lemma statement, or if it is just something where we need to be more clever in the proof. Can you give me a detailed analysis of the situation? Please also let me know if you have any questions about the context so far. Do not think about distributivity, because that is not what we are aiming for at the moment. You should stick to standard intersection subtyping rules, but extend them with these special nominal label types.

Claude

Analysis of the Inversion Property with Intersection Types

This is a meaty problem. Let me work through it carefully, because I think the stuck case you’ve identified is genuinely informative — it’s pointing at something real about the structure of the lemma, not just a proof-engineering issue. Let me first make sure I have the setup right, then dig into the stuck case.

Setting up the problem

To make sure we’re on the same page, here’s how I’m reading it:

Types include intersections A & B, function types A -> B, Top, type variables X, and nominal label types {X : C} where X is a nominal identity.
The notation A[X -> {X : C}] means: take A, and substitute the type variable X (where it occurs free as a variable) with the label type {X : C}. The label X itself is a nominal identifier that’s not subject to substitution.
Subtyping <: is standard structural subtyping with intersection rules: A & B <: A, A & B <: B, contravariant/covariant for arrows, distributivity is excluded (per your request), and labels are compared by their bodies (so {X : C} <: {X : D} iff C <: D, modulo whatever variance you’ve assigned — I’ll assume invariant or covariant; tell me if it matters).
Equivalence ~ is mutual subtyping.
The annotated relation A <: B \ S is subtyping where reflexive use of variables in S is disallowed (i.e., the rule X <: X is forbidden when X ∈ S), but X may still appear free if it’s “absorbed” by Top or “dropped” by intersection elimination.

The target lemma:

If A[X -> {X:C}] ~ B[X -> {X:D}], then either (1) C ~ D, or (2) A ~ B \ {X} (extending whatever avoidance set you already had).

I’ll write Â = A[X -> {X:C}] and B̂ = B[X -> {X:D}] for brevity.

The stuck case, restated

You have A = A₁ & A₂, B ordinary, and Â ~ B̂. From intersection inversion on Â <: B̂:

Either Â₁ <: B̂ or Â₂ <: B̂ (one branch suffices because B is ordinary).

From B̂ <: Â, you get both B̂ <: Â₁ and B̂ <: Â₂ (intersection introduction requires both).

Suppose the first disjunct holds: Â₁ <: B̂. Combined with B̂ <: Â₁, you have Â₁ ~ B̂, so the IH gives you either C ~ D (done) or A₁ ~ B \ {X}.

Now you want to conclude A₁ & A₂ ~ B \ {X}. The <: direction is fine: A₁ <: B \ {X} implies A₁ & A₂ <: B \ {X} by intersection elimination. The hard direction is B <: A₁ & A₂ \ {X}, which requires both B <: A₁ \ {X} (you have it) and B <: A₂ \ {X} (you don’t).

You have B̂ <: Â₂, but you can’t lift this to B <: A₂ \ {X} because X might genuinely participate in this comparison.

My analysis: this is not a counterexample, but the lemma needs strengthening

I think the stuck case is real, but it’s telling you that the disjunction in the lemma is too coarse. Let me give a concrete instance to see what’s going on, and then suggest a reformulation.

A concrete instance of the stuck pattern

Let A₁ = X, A₂ = Top -> X, B = X, with substituents C = D (for now, to keep things simple — we’ll perturb this).

Then:

Â = {X:C} & (Top -> {X:C})
B̂ = {X:C}

Clearly B̂ <: Â₁ = {X:C} (reflexivity), and B̂ <: Â₂ = Top -> {X:C}? No — {X:C} is a label, not a function, so this fails. So in this instance, B̂ ~ Â doesn’t actually hold. Good — this means the case isn’t trivially constructible.

Let me try harder. The mutual-subtyping condition is strong. For B̂ <: Â₂ to hold while Â₂ mentions X non-trivially, B̂ itself needs to “cover” Â₂, which constrains what B can look like.

The key observation

Here’s what I think is going on. In the stuck case, you’re given:

1. Â₁ ~ B̂ (so the IH applies to the (A₁, B) pair)
2. B̂ <: Â₂ (one-directional)
3. Â₂ <: B̂ is not necessarily individually true: Â₂ <: B̂ could hold, or it could be that only the combination Â₁ & Â₂ <: B̂ holds (via picking Â₁).

In subcase (3a) where Â₂ <: B̂ also holds individually: then Â₂ ~ B̂ too, so by IH on (A₂, B) you get either C ~ D (done) or A₂ ~ B \ {X}. Combined with A₁ ~ B \ {X}, you can build A₁ & A₂ ~ B \ {X} (the <: direction uses either branch; the :> direction uses both, both of which you now have). ✓

In subcase (3b) where Â₂ <: B̂ does not hold individually but B̂ <: Â₂ does: this is your genuinely stuck case. You have B̂ <: Â₂ but no IH for A₂ because there’s no equivalence.
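
The split into subcases can be checked mechanically on already-substituted types. The sketch below assumes the usual algorithmic reading of non-distributive intersection subtyping (split an intersection on the right, then pick a branch on the left); the tuple encoding and the function names are hypothetical.

```python
# Hypothetical tuple encoding of types.
TOP = ("top",)
def arr(a, b): return ("arr", a, b)    # a -> b
def meet(a, b): return ("and", a, b)   # a & b
def lab(x, c): return ("lab", x, c)    # {x : c}

def sub(a, b):
    """a <: b without distributivity, labels covariant in their body."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "var" and b[0] == "var":
        return a[1] == b[1]
    if a[0] == "arr" and b[0] == "arr":
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2])
    return False

def stuck_subcase(a1_hat, a2_hat, b_hat):
    """Classify substituted components; b_hat is assumed to be ordinary."""
    if not (sub(a1_hat, b_hat) and sub(b_hat, a1_hat) and sub(b_hat, a2_hat)):
        return "outside the stuck pattern"
    return "3a" if sub(a2_hat, b_hat) else "3b"

# A = (Top -> Top) & (X -> Top), B = Top -> Top, taking C = Top
b_hat = arr(TOP, TOP)
a1_hat = arr(TOP, TOP)
a2_hat = arr(lab("X", TOP), TOP)
print(stuck_subcase(a1_hat, a2_hat, b_hat))  # 3b
```

Here the second component is a strict supertype of B̂, which is exactly the situation where no induction hypothesis is available for A₂.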

Why subcase (3b) is the heart of the matter

In subcase (3b), Â₂ is strictly larger than B̂ (in the subtyping sense). This means A₂ as a “shape” admits things that B doesn’t. For your example A = (Top -> Top) & (X -> Top), B = Top -> Top: A₂ = X -> Top is strictly larger than B = Top -> Top because B <: A₂ (since X <: Top) but A₂ <: B would require Top <: X, which generally fails.

In this case, A₂ is “absorbing” X into a Top upper bound contravariantly — exactly the “path avoidance” you described. The variable X appears free in A₂ but its only role in subtyping is to be subsumed by Top (or analogously, to be eliminated by intersection on the other side).

A reformulated lemma

I think the right strengthening is to weaken what “avoidance” means, or, more cleanly, to make the avoidance set carry the obligation that X participates only via subsumption/elimination, which is what your \ S annotation already attempts. The question is whether the \ S-annotated subtyping is closed under the moves you need.

The crucial property you need is something like:

(Absorption Lemma) If B̂ <: Â (one direction only) and B̂ ~ B'̂ for some B' with X ∉ FV(B') covering the “active” part of Â, then B <: A \ {X} in your annotated system.

That is: when X only appears in A in positions that are subsumed in the comparison, the annotated subtyping should be derivable. Concretely, the rule for S <: Top \ {X} should hold unconditionally (no recursion into S), because Top absorbs everything, including free X. Similarly, S₁ & S₂ <: T \ {X} should be derivable from S₁ <: T \ {X} or S₂ <: T \ {X} (standard intersection elimination), so a branch with free X can be dropped.

If your \ S system has those rules — and from your description it sounds like it does — then the Absorption Lemma should be provable directly by induction on the derivation of B̂ <: Â, using the equivalence with B̂ to bound how much X can actively show up.

But wait — this doesn’t quite work in the stuck case as stated, because we have B̂ <: Â₂ but Â₂ is not equivalent to B̂. However, we have something almost as good: Â₂ sits in an intersection Â₁ & Â₂ whose total is equivalent to B̂, and Â₁ ~ B̂ already. So Â₂ is “redundant on top of” Â₁ in the sense that adding Â₂ to Â₁ doesn’t change the equivalence class.

The reformulation I’d suggest

I think the cleanest reformulation is:

Lemma (revised). If Â ~ B̂, then either:

C ~ D, or

A ~ B \ {X} in the avoidance system, where the avoidance system has the rule that any subtype of Top or any branch of an intersection that’s eliminated may freely contain X.

The stuck case then becomes provable by the following move: when you reach the subgoal B <: A₂ \ {X} and Â₂ contains X in subsumed-only positions (which is forced by B̂ <: Â₂ together with B̂ being X-free in the right way), the derivation of B̂ <: Â₂ can be mimicked structurally in the annotated system, with each X-occurrence in Â₂ being handled by the Top-absorption rule or the intersection-drop rule.

The key technical lemma you’d need is:

(Mimicking Lemma) If B̂ <: Â is derivable, and at every leaf of the derivation where X appears free in the right-hand side, that leaf is either an instance of _ <: Top or an intersection elimination, then B <: A \ {X} is derivable in the annotated system.

This is provable by straightforward induction on the derivation of B̂ <: Â, provided the annotated system has the rules I mentioned. The remaining question is: in the stuck case, is the assumption of the Mimicking Lemma actually satisfied?

I claim yes, and here’s the intuition: B doesn’t mention X (in the sense that any X-occurrences in B get substituted with {X:D}, but the structure of B doesn’t have X as a variable hanging around to “actively use”). So when we derive B̂ <: Â₂, every step on the right-hand side either matches B̂’s structure (in which case X shows up in Â₂ only where B̂ has {X:D}, which is fine because the labels are getting compared) or is an absorption/elimination step.

But this last claim needs to be formalized carefully: it depends on B’s X-occurrences being “consumed” by matching Â’s X-occurrences in the same positions. And this is exactly the case where the IH would give you C ~ D instead of needing the avoidance branch. That is, if X actively appears on both sides in matching positions, the comparison of {X:C} against {X:D} is forced, and you get C ~ D.

So I think the lemma’s two branches really are exhaustive, and the proof goes through with a strengthened mimicking principle.

Concrete suggestions for moving forward

Let me lay out three things that would help nail this down:

(i) Verify your annotated system’s rules. Specifically, write out the rules for <: \ S and confirm:

T <: Top \ S for any T, regardless of free variables in T.
T₁ & T₂ <: U \ S if T₁ <: U \ S or T₂ <: U \ S.
The variable rule X <: X \ S requires X ∉ S.
All other rules just propagate S.
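
The four rules above can be prototyped as a small checker. This is a minimal Python sketch, assuming an algorithmic presentation in which an intersection on the right is split first and a branch on the left is chosen afterwards; the tuple encoding and the names `sub`, `meet`, `lab` are illustrative only.

```python
# Hypothetical tuple encoding of types.
TOP = ("top",)
def var(x): return ("var", x)
def arr(a, b): return ("arr", a, b)    # a -> b
def meet(a, b): return ("and", a, b)   # a & b
def lab(x, c): return ("lab", x, c)    # {x : c}

def sub(a, b, avoid=frozenset()):
    """Check a <: b \\ avoid: variables in `avoid` may not use reflexivity."""
    if b[0] == "top":
        return True                      # T <: Top \ S, whatever T mentions
    if b[0] == "and":
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "and":
        # Either branch may be dropped, even if it mentions avoided variables.
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] == "var" and b[0] == "var":
        return a[1] == b[1] and a[1] not in avoid
    if a[0] == "arr" and b[0] == "arr":
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2], avoid)
    return False

X = var("X")
S = frozenset({"X"})
print(sub(X, X))                                  # True
print(sub(X, X, S))                               # False: reflexivity on X is forbidden
print(sub(X, TOP, S))                             # True: absorbed by Top
print(sub(meet(X, arr(TOP, TOP)), arr(TOP, TOP), S))  # True: the X branch is dropped
```

With an empty avoidance set the function is meant to coincide with plain subtyping, so the same code can serve for both relations.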

(ii) Prove the Mimicking Lemma. State it precisely as: if B̂ <: Â and X ∉ FV(B), and a certain “subsumption-only” condition on X-occurrences in Â holds, then B <: A \ {X}. Induct on the derivation.

(iii) Tackle the stuck case using mimicking. In the case A = A₁ & A₂, Â₁ ~ B̂, B̂ <: Â₂ (with Â₂ <: B̂ not necessarily holding), apply the Mimicking Lemma directly to B̂ <: Â₂ to get B <: A₂ \ {X}. The condition on X-occurrences in Â₂ should follow from the fact that B is X-free as a variable (the only X-occurrences in B̂ come from substitution, which match the labels in Â).

Questions I have for you

A few things I’d want to pin down before being more confident:

Variance of labels. Is {X : C} <: {X : D} covariant in the body (i.e., requires C <: D), invariant (requires C ~ D), or something else? The answer affects how labels propagate equivalence.
Where exactly does X occur in A and B? You said “covariantly and contravariantly,” which I take to mean X can appear anywhere as a variable. Is the substitution X -> {X:C} really replacing all free occurrences uniformly?
The shape of B in the stuck case. Is B an arbitrary ordinary type (variable, arrow, label, but not intersection), or is there a more specific form? In the example you gave (A = (Top -> Top) & (X -> Top), B = Top -> Top), B is a closed arrow. Does the difficulty persist when B mentions X (e.g., B = X -> Top)?
Is “ordinary” defined as “not a top-level intersection”? I’m assuming so, but want to confirm — there are subtleties when B itself contains intersections in negative positions.
When you say “the second condition… is simply expressing that substituting X with any types into A and B doesn’t break their equivalence” — do you have in mind a parametricity-style semantic statement? That framing would suggest a logical-relations approach as an alternative to the syntactic avoidance set, and the avoidance set is essentially trying to be a syntactic shadow of that. If so, it might be worth proving the syntactic and semantic versions equivalent as a sanity check.

My overall take: the stuck case is real, but I believe it’s a proof-engineering issue, not a counterexample. The fix is to strengthen the inductive structure with a Mimicking Lemma that lets you “lift” one-directional subtyping derivations into the annotated system whenever the variable X only appears in absorbed/eliminated positions. The stuck subcase (3b) is exactly such a situation, by the structure of mutual subtyping with an ordinary B. But I’d like to hear your answers to the questions above before being fully confident, and it’d be worth attempting a careful write-out of the Mimicking Lemma to see whether the induction really goes through.

You

For the first question, it is covariant subtyping.
For the second question, I just meant to say that X can appear anywhere as a variable, whether as function input types or output types, contravariantly or covariantly. I am inviting you to think a bit harder on that, because this is where interactions with intersection types can get tricky.
For the third question, I think the ordinary type is a good place to start. For intersection, if B is an intersection type, it can be destructed. Since we already got stuck on ordinary types, it will be even harder for intersection types. We have to solve the ordinary type case first, so while you could explore different shapes of B, I suggest starting with ordinary types just to solve the case where we are stuck.
Regarding your assumption in the fourth question, you are correct: in my paper, “ordinary” is defined as not being a top-level intersection. When B is a function type that contains intersections in negative or positive positions, but is overall still a function type, we still consider it to be ordinary. Ordinary basically means you can deterministically choose to apply the left elimination rule for intersections. If B is not ordinary, the best strategy to algorithmically test subtyping is to first destruct B to generate two proof goals. This approach makes things more algorithmic and is a good way to split the cases.
Regarding the fifth question, I do have a semantic specification in mind. The thing I want to achieve is this: if A is a subtype of B avoiding S, then for any X in the set S and for any E and F, we have A[X → {X:E}] <: B[X → {X:F}]. I’m not sure about this specification; specifically, I’m not sure about the proof of equivalence to the syntactic rules I’m proposing. This is a valid suggestion, but I think in general, because the equivalence checking is algorithmic, I don’t see why adhering to this property would be non-algorithmic. In the most ideal case, my ultimate goal would be to have this checking implementable by syntactic rules. It’s just that I’m finding the current syntactic rules very hard to work with due to the stuck case I described to you.

Claude

Refined Analysis

Thank you for the clarifications — they sharpen the problem considerably. Let me reconsider with these in mind, and I think we’ll see the stuck case is actually pointing at something more subtle than I first credited.

Revisiting the assumptions

A few things I now appreciate better:

Covariant labels. {X:C} <: {X:D} iff C <: D. So mutual subtyping of labels gives C ~ D. Good.
X truly occurs anywhere in A and B. This matters because it means in the stuck case Â₂ and B̂ can have X showing up in different positions, and the interactions between intersection-elimination and subsumption can scramble where X shows up versus where it gets compared.
Ordinary B only for now.
The semantic spec is: A <: B \ S should mean ∀X ∈ S, ∀E, F. A[X -> {X:E}] <: B[X -> {X:F}].

That semantic spec is the one to keep front and center, because it’s our ground truth. Anything we propose syntactically must be sound and complete with respect to it (or at least, we should know which direction fails).
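
One cheap way to keep the semantic spec in view is to test it on a finite pool of substituents and compare with the syntactic avoidance check. The Python sketch below does that; passing the sampled test is only a necessary condition for the real (universally quantified) spec, and the encoding, the pool, and all function names are assumptions for illustration.

```python
from itertools import product

# Hypothetical tuple encoding of types.
TOP = ("top",)
def var(x): return ("var", x)
def arr(a, b): return ("arr", a, b)
def meet(a, b): return ("and", a, b)
def lab(x, c): return ("lab", x, c)

def sub(a, b, avoid=frozenset()):
    """a <: b \\ avoid, non-distributive, labels covariant."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "and":
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] == "var" and b[0] == "var":
        return a[1] == b[1] and a[1] not in avoid
    if a[0] == "arr" and b[0] == "arr":
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2], avoid)
    return False

def subst(t, x, body):
    """t[x -> {x : body}]"""
    if t[0] == "var":
        return lab(x, body) if t[1] == x else t
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, body), subst(t[2], x, body))
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, body))
    return t

def semantic_sample(a, b, x, pool):
    """Check a[x->{x:E}] <: b[x->{x:F}] for every E, F drawn from a finite pool."""
    return all(sub(subst(a, x, e), subst(b, x, f))
               for e, f in product(pool, repeat=2))

X = var("X")
pool = [TOP, arr(TOP, TOP), var("a")]
for a, b in [(arr(TOP, TOP), arr(X, TOP)), (X, X)]:
    print(sub(a, b, frozenset({"X"})), semantic_sample(a, b, "X", pool))
# prints "True True" then "False False"
```

On these two pairs the syntactic and sampled semantic answers agree; a disagreement on some pair would point at which direction (soundness or completeness) fails.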

A counterexample candidate I want to put on the table

Let me try harder to break the lemma before defending it. Consider:

A₁ = X (a variable; its polarity comes from context)
A₂ = X -> Top
A = A₁ & A₂ = X & (X -> Top)
B = X (ordinary)

Then with substitution X -> {X:C} on the left and X -> {X:D} on the right:

Â = {X:C} & ({X:C} -> Top)
B̂ = {X:D}

Check B̂ <: Â:

B̂ <: Â₁: {X:D} <: {X:C} requires D <: C.
B̂ <: Â₂: {X:D} <: ({X:C} -> Top)? No — a label is not a function type. So this fails.

OK, that doesn’t give equivalence. Let me try again with A₂ chosen so that B̂ <: Â₂ is forced to hold by Top-absorption:

A₁ = X, A₂ = X -> Top doesn’t work because B̂ = {X:D} can’t subtype an arrow.

What if B is itself an arrow? But you’ve asked me to focus on ordinary B. Ordinary just means not a top-level intersection, so B can be a label, arrow, variable, or Top.

Let me try B = Top -> Top, A₁ = Top -> Top, A₂ = X -> Top:

Â = (Top -> Top) & ({X:C} -> Top) (no X in A₁)
B̂ = Top -> Top

Mutual subtyping:

Â <: B̂: pick Â₁ = Top -> Top <: Top -> Top. ✓
B̂ <: Â₁: Top -> Top <: Top -> Top. ✓
B̂ <: Â₂: Top -> Top <: {X:C} -> Top. Contravariantly need {X:C} <: Top ✓, covariantly Top <: Top ✓. ✓

Great, so Â ~ B̂.

This is your original example. Now in the lemma, we want either C ~ D (no constraint between them — D doesn’t even appear, in fact B has no X) or A ~ B \ {X}.

Here D is unconstrained, so we need A ~ B \ {X}, i.e., (Top -> Top) & (X -> Top) ~ Top -> Top \ {X}.

(Top -> Top) & (X -> Top) <: Top -> Top \ {X}: pick the left branch, Top -> Top <: Top -> Top \ {X}. ✓ (no X involved).
Top -> Top <: (Top -> Top) & (X -> Top) \ {X}: need both Top -> Top <: Top -> Top \ {X} ✓ and Top -> Top <: X -> Top \ {X}. The latter requires contravariantly X <: Top \ {X} (which holds, by the Top-absorption rule we want) and covariantly Top <: Top \ {X} ✓.

So this case works if the avoidance system has unconditional _ <: Top \ S. Good.

Now let me really try to break it

What if X appears in A₂ in a position that’s not absorbed by Top and not dropped by intersection elimination? Then B̂ <: Â₂ would force a real comparison involving {X:D}.

Example:

A₁ = X, A₂ = X (just to keep it simple — but this collapses A₁ & A₂ to X essentially, modulo the intersection structure)

Actually let me try:

A₁ = {X:Top} (a label, not the substitution!) — no wait, the problem says X doesn’t appear free as a label, only as a variable. So I can’t write {X:Top} as part of A.

Hmm, this constraint is important. A and B cannot have X appearing as a label name. So when Â = A[X -> {X:C}] produces label {X:C}, every label {X:_} in Â came from substitution.

This is a real structural constraint. Let me use it.

A₁ = X -> X, A₂ = X, B = X.
Â = ({X:C} -> {X:C}) & {X:C}
B̂ = {X:D}

Check B̂ <: Â:

B̂ <: Â₁: {X:D} <: ({X:C} -> {X:C})? No, label vs arrow. ✗

So this isn’t equivalent either. Let me think about when equivalence can actually hold with B ordinary.

Key observation: If B is ordinary and B̂ ~ Â₁ & Â₂, then B̂ must subtype both Â₁ and Â₂, and at least one of the Âᵢ must subtype B̂. So B̂ lies “between” them: below the intersection of the two Âᵢ and above at least one of them.

In the case where B̂ ~ Â₁ (so Â₁ ~ B̂ mutually) and B̂ <: Â₂ (one direction), Â₂ is some “weaker” type than B̂ — strictly weaker, or equal. If Â₂ <: B̂ also holds, we’re back in the easy subcase. If not, Â₂ must be a proper supertype of B̂.

Question: What types are proper supertypes of an ordinary type B̂? Given the rule structure (no distributivity), if B̂ is an arrow B₁ -> B₂, supertypes are: Top, arrows B₁' -> B₂' with B₁' <: B₁ and B₂ <: B₂', intersections of supertypes. If B̂ is a label, supertypes are: Top, labels with the same name and supertype body, intersections.

Where can X appear in such a supertype Â₂ of B̂ while not being “absorbed”? Let’s enumerate:

Â₂ = Top: X doesn’t appear, because substitution never turns anything into Top, so A₂ must be Top itself. Fine.
Â₂ = B₁' -> B₂' (arrow supertype of B̂’s arrow): X could appear in B₁' or B₂'. Contravariantly in B₁': we need B₁' <: B₁, so B₁' is a subtype of B̂’s domain. Covariantly in B₂': B₂' :> B₂.

This is the case to worry about. Let me construct it:

B = U -> V (ordinary, an arrow).
A₁ = U -> V (so Â₁ = Û -> V̂ where Û, V̂ are U, V with substitution).
A₂ = U' -> V' where (after substitution) U' <: U and V' :> V, and X appears in U' or V'.

Specifically:

B = Top -> Top (so B̂ = Top -> Top)
A₁ = Top -> Top
A₂ = (X -> Top) -> Top. Then Â₂ = ({X:C} -> Top) -> Top. Is B̂ <: Â₂? Contravariantly: ({X:C} -> Top) <: Top ✓. Covariantly: Top <: Top ✓. Yes, B̂ <: Â₂. And Â₂ <: B̂? Contravariantly this needs Top <: ({X:C} -> Top), which is false: without distributivity-style rules, Top is not a subtype of any arrow. So Â₂ <: B̂ fails. We’re in subcase (3b).

Now: does A ~ B \ {X} hold? Specifically, we need B <: A₂ \ {X}, i.e., Top -> Top <: (X -> Top) -> Top \ {X}.

Contravariantly: (X -> Top) <: Top \ {X}. ✓ (Top-absorption)
Covariantly: Top <: Top \ {X}. ✓

Yes! So A ~ B \ {X} holds, because the X in A₂ is, despite appearances, eventually absorbed by a Top that sits outside of X. The arrow’s domain X -> Top itself appears contravariantly against B’s Top domain, and Top absorbs the whole X -> Top.

This is the absorption pattern, but at one remove. Mimicking should still work here, because the derivation of B̂ <: Â₂ itself ends with a _ <: Top step that absorbs the entire subexpression containing X.
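
Both absorption examples (X directly under an arrow, and X one arrow deeper) can be replayed with a small checker. This is a sketch under the same assumptions as the rules discussed so far: no distributivity, covariant labels, and unconditional Top-absorption in the avoidance system. The encoding and names are hypothetical, and C, D are arbitrary sample types chosen so that C and D are not equivalent.

```python
# Hypothetical tuple encoding of types.
TOP = ("top",)
def var(x): return ("var", x)
def arr(a, b): return ("arr", a, b)
def meet(a, b): return ("and", a, b)
def lab(x, c): return ("lab", x, c)

def sub(a, b, avoid=frozenset()):
    """a <: b \\ avoid, non-distributive, labels covariant."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "and":
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] == "var" and b[0] == "var":
        return a[1] == b[1] and a[1] not in avoid
    if a[0] == "arr" and b[0] == "arr":
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2], avoid)
    return False

def subst(t, x, body):
    """t[x -> {x : body}]"""
    if t[0] == "var":
        return lab(x, body) if t[1] == x else t
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, body), subst(t[2], x, body))
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, body))
    return t

def equiv(a, b, avoid=frozenset()):
    return sub(a, b, avoid) and sub(b, a, avoid)

X = var("X")
B = arr(TOP, TOP)
A_flat = meet(arr(TOP, TOP), arr(X, TOP))              # X directly in a domain
A_nested = meet(arr(TOP, TOP), arr(arr(X, TOP), TOP))  # X one arrow deeper
C, D = arr(TOP, TOP), TOP                              # sample substituents, C not ~ D

for A in (A_flat, A_nested):
    print(equiv(subst(A, "X", C), subst(B, "X", D)),   # True: substituted types equivalent
          equiv(C, D),                                 # False: first branch unavailable
          equiv(A, B, frozenset({"X"})))               # True: avoidance branch holds
```

In both cases the second branch of the lemma is the one that applies, matching the hand derivations above.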

The real counterexample attempt

Let me push harder. What if the absorption is “imperfect” — X appears in A₂ somewhere that gets compared to X in B, and A₂ is not equivalent to B?

B = X -> Top, so B̂ = {X:D} -> Top (B is substituted with D, not C).
A₁ = X -> Top, so Â₁ = {X:C} -> Top.
A₂ = X -> X, so Â₂ = {X:C} -> {X:C}.

Mutual subtyping Â ~ B̂?

Â₁ <: B̂: {X:C} -> Top <: {X:D} -> Top requires {X:D} <: {X:C} (contravariant), i.e., D <: C.
Â₂ <: B̂: {X:C} -> {X:C} <: {X:D} -> Top requires {X:D} <: {X:C} (i.e., D <: C) and {X:C} <: Top ✓.
For Â <: B̂, we need one of the above. Either way we need D <: C.
B̂ <: Â₁: requires {X:C} <: {X:D}, i.e., C <: D.
B̂ <: Â₂: requires {X:C} <: {X:D} (from contravariance) and Top <: {X:C} (from covariance, which is false).

So B̂ <: Â₂ fails, and equivalence fails. Not a counterexample.

Let me modify: make A₂ such that B̂ <: Â₂ holds.

A₂ = X -> Top (same as A₁). Then Â₂ = Â₁ and we’re not exploring new territory.
A₂ = (X -> X) -> Top. Then Â₂ = ({X:C} -> {X:C}) -> Top. B̂ <: Â₂: contravariantly we need the domain of Â₂ to be a subtype of the domain of B̂, i.e., ({X:C} -> {X:C}) <: {X:D}. Arrow vs label. ✗. And Â₂ <: B̂: contravariant {X:D} <: ({X:C} -> {X:C})? Label vs arrow. ✗.

So B̂ <: Â₂ fails and equivalence fails again. Not a counterexample. Note also that with A₁ = X -> Top and B = X -> Top, Â₁ ~ B̂ by itself already requires D <: C (from Â₁ <: B̂) and C <: D (from B̂ <: Â₁), so any variation that keeps Â₁ ~ B̂ lands in the first branch with C ~ D.

Hmm, the lemma seems to keep escaping: either equivalence fails outright, or the comparison forces C ~ D. Is that always what happens when there’s a non-trivial interaction? Let me think about why.
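
Hunting for counterexamples by hand is error-prone, so it may help to classify instances mechanically. The Python sketch below takes A, B, and a small pool of candidate C, D and reports which situation each pair falls into; a result of "counterexample" would mean the two-branch lemma fails for that instance. The encoding, the pool, and the function names are assumptions, and a finite pool can of course only refute the lemma, never prove it.

```python
from itertools import product

# Hypothetical tuple encoding of types.
TOP = ("top",)
def var(x): return ("var", x)
def arr(a, b): return ("arr", a, b)
def meet(a, b): return ("and", a, b)
def lab(x, c): return ("lab", x, c)

def sub(a, b, avoid=frozenset()):
    """a <: b \\ avoid, non-distributive, labels covariant."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "and":
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] == "var" and b[0] == "var":
        return a[1] == b[1] and a[1] not in avoid
    if a[0] == "arr" and b[0] == "arr":
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2], avoid)
    return False

def subst(t, x, body):
    """t[x -> {x : body}]"""
    if t[0] == "var":
        return lab(x, body) if t[1] == x else t
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, body), subst(t[2], x, body))
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, body))
    return t

def equiv(a, b, avoid=frozenset()):
    return sub(a, b, avoid) and sub(b, a, avoid)

def classify(a, b, x, c, d):
    """Which case of the lemma applies to a[x->{x:c}] versus b[x->{x:d}]?"""
    if not equiv(subst(a, x, c), subst(b, x, d)):
        return "not equivalent"
    if equiv(c, d):
        return "C ~ D"
    if equiv(a, b, frozenset({x})):
        return "avoidance"
    return "counterexample"

def outcomes(a, b, x, pool):
    return {classify(a, b, x, c, d) for c, d in product(pool, repeat=2)}

X = var("X")
pool = [TOP, arr(TOP, TOP)]
attempt = meet(arr(X, TOP), arr(X, X))           # A1 = X -> Top, A2 = X -> X
original = meet(arr(TOP, TOP), arr(X, TOP))      # the path-avoidance example
print(outcomes(attempt, arr(X, TOP), "X", pool))             # {'not equivalent'}
print(sorted(outcomes(original, arr(TOP, TOP), "X", pool)))  # ['C ~ D', 'avoidance']
```

Extending `pool` or enumerating small A and B is a straightforward way to stress the lemma further before committing to a proof strategy.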

The structural reason the lemma should hold

Here’s an argument I find convincing. Consider the “subsumption pattern” in the derivation of B̂ <: Â₂:

When we trace B̂ <: Â₂, we walk down the type structure of Â₂. Every time we encounter {X:C} on the right (label, contravariant after some arrow flips), we need {X:_} <: {X:C} on the left from B̂’s side, which requires B̂ to have {X:?} in the corresponding position, meaning B has X (as a variable) in that position. Then the comparison forces the corresponding bodies to relate.

So every “active” occurrence of X in Â₂ (i.e., one that’s not absorbed by Top or dropped by intersection) must be matched by a corresponding X in B̂ in the right position with the right polarity. That match generates a comparison {X:C} ⋈ {X:D} (for some direction or both), and if both directions of the match arise from the mutual subtyping, we get C ~ D.

The question is: in the stuck case, do both directions of the match arise?

We have:

B̂ <: Â₁ and Â₁ <: B̂ (so any X-occurrence in Â₁ matched by an X in B̂ gives C ~ D from the IH on Â₁ ~ B̂).
B̂ <: Â₂ only (not Â₂ <: B̂).

So an active X in Â₂ matched by X in B̂ only gives us one direction: {X:C} and {X:D} related in one direction. That alone gives C <: D or D <: C depending on the polarity, not C ~ D.

However, we also have B̂ <: Â₁ from the other part of the intersection. If B’s X-occurrence (in the same position) is also matched in Â₁ (which has X-occurrences too, possibly in the same positions), then between Â₁ ~ B̂ (full mutual) and the matched comparisons in Â₁, we do get C ~ D.

But what if B’s X-occurrence at a given position matches an X in Â₂ but not in Â₁?

Let me construct that:

B = X -> X (so X in both domain and codomain).
A₁ = Top -> X (X only in codomain).
A₂ = X -> Top (X only in domain).

Then Â₁ = Top -> {X:C}, Â₂ = {X:C} -> Top, B̂ = {X:D} -> {X:D}.

Â₁ <: B̂: Top -> {X:C} <: {X:D} -> {X:D}? Contravariant {X:D} <: Top ✓, covariant {X:C} <: {X:D} requires C <: D.
Â₂ <: B̂: contravariant {X:D} <: {X:C} requires D <: C, covariant Top <: {X:D} ✗.
Â <: B̂: needs one of the above. From Â₁: C <: D.
B̂ <: Â₁: contravariant Top <: {X:D} ✗. Fails!

So equivalence fails. Not a counterexample.

The issue: when B’s structure is “richer” than A₁’s, B̂ <: Â₁ may already fail.

Let me retry, ensuring B̂ <: Â₁ holds:

B = X -> X (so B̂ = {X:D} -> {X:D}).
A₁ = X -> X (so Â₁ = {X:C} -> {X:C}).
A₂ = ? such that B̂ <: Â₂ and Â <: B̂ (using Â₁ or Â₂).

Â₁ ~ B̂: requires {X:C} ~ {X:D} from matching the domains (and again from the codomains), i.e., both C <: D and D <: C. So C ~ D directly. First branch holds. We’re done.

The structural reason: if Â₁ ~ B̂ and X appears actively (not absorbed) in any matching position in Â₁ and B̂, the equivalence in labels is forced and we get C ~ D from the IH.

So the only way to be in subcase (3b) without already getting C ~ D is if Â₁ ~ B̂ holds without forcing any active X-comparisons, which means X in A₁ is only in absorbed positions (and X in B likewise). But then any X-occurrence in Â₂ matched by X in B̂ would have to sit in a position that is active for Â₂, while B̂ only has X in positions that are absorbed relative to Â₁, so B’s X-positions are constrained.

Let me make this more precise.

Sharpening the argument

Claim: If Â₁ ~ B̂ and Â₁ doesn’t force C ~ D, then B’s active X-occurrences (relative to any comparison) are limited.

Hmm, this isn’t quite right either, because “active” is relative to a particular subtyping derivation. Let me think differently.

Reframing: Use the semantic spec as the guide. The semantic spec says A <: B \ {X} iff ∀E, F. A[X -> {X:E}] <: B[X -> {X:F}].

So A ~ B \ {X} semantically means: for all E, F, A[X -> {X:E}] ~ B[X -> {X:F}]. Wait — equivalence of A and B under arbitrary independent E and F substitutions is a very strong statement. It says X is genuinely a “free parameter” that doesn’t constrain the equivalence.

In your example A = (Top -> Top) & (X -> Top), B = Top -> Top: for arbitrary E, F, is A[X -> {X:E}] ~ B[X -> {X:F}]? Yes, because X in A is in a contravariant position dominated by Top. ✓

In the original lemma’s stuck case: we want, for arbitrary E, F, A₁[X -> {X:E}] & A₂[X -> {X:E}] ~ B[X -> {X:F}]. We have Â₁ ~ B̂ for specific C, D. The IH (semantic version) says: either C ~ D, or for arbitrary E, F, A₁[X -> {X:E}] ~ B[X -> {X:F}]. Take this second branch.

We additionally have B̂ <: Â₂ for specific C, D. We need: for arbitrary E, F, B[X -> {X:F}] <: A₂[X -> {X:E}]. Is this derivable from B̂ <: Â₂ for one specific (C, D)?

This is exactly where things get subtle, and where I want to backtrack on my earlier confidence.

Suppose B̂ <: Â₂ holds because some X -position in Â₂ matches an X -position in B̂, and the comparison happens to work for (C, D). Will it work for arbitrary (E, F)?

Concrete instance: A₂ = X, B = X. Then Â₂ = {X:C}, B̂ = {X:D}. B̂ <: Â₂ requires D <: C. For arbitrary E, F, we’d need F <: E, which is not true for arbitrary E, F. So semantically, B <: A₂ \ {X} does not hold.

But in this scenario, can we be in the stuck case? Let’s check: we’d need Â₁ ~ B̂ and B̂ <: Â₂ (with Â₂ <: B̂ failing). With A₂ = X, B = X, Â₂ = {X:C}, B̂ = {X:D}. Â₂ <: B̂ requires C <: D. So if both C <: D and D <: C hold, we have C ~ D (first branch of lemma!). Otherwise one of them fails. If D <: C but not C <: D, then B̂ <: Â₂ holds, and Â <: B̂ must be derived by picking some component Âᵢ <: B̂.

For Â <: B̂ we need Â₁ <: B̂ (since Â₂ <: B̂ fails). And for Â₁ ~ B̂, we need B̂ <: Â₁ and Â₁ <: B̂.

So pick A₁ such that Â₁ ~ B̂ holds without forcing C ~ D. E.g., A₁ = X makes Â₁ = {X:C} and Â₁ ~ B̂ requires C ~ D. Bad. Try A₁ = Top -> Top, B = X doesn’t match. Try A₁ = X, B = Top -> Top: then Â₁ = {X:C}, B̂ = Top -> Top. Â₁ ~ B̂ requires {X:C} ~ Top -> Top which fails (different head constructors).

Constraint: B is ordinary. If B is a variable X, then B̂ = {X:D}, a label. For Â₁ ~ B̂, Â₁ must be a label or contain one essentially as its top form. With our restriction (no top-level intersection in B, but A₁ can be anything), Â₁ could be {X:C} (i.e., A₁ = X) or {X:C} & ... (but then A₁ is intersection — fine, that’s allowed for A₁). Hmm wait, A₁ is a component of A, and A is itself an intersection; A₁ could be ordinary or itself an intersection.

Let me focus: if A₁ = X and B = X, then Â₁ ~ B̂ forces C ~ D. If A₁ is a more complex shape like X & Top (intersection), then Â₁ = {X:C} & Top ~ {X:C} — same as A₁ = X essentially. Doesn’t help.

What if A₁ = X & (Top -> Top)? Then Â₁ = {X:C} & (Top -> Top). Â₁ ~ B̂ with B̂ = {X:D}: Â₁ <: B̂: pick {X:C} <: {X:D}, i.e., C <: D. B̂ <: Â₁: needs B̂ <: {X:C} (i.e., D <: C) AND B̂ <: Top -> Top (i.e., {X:D} <: Top -> Top, fails). So Â₁ ~ B̂ fails. Damn.

It’s looking like, with B = X and B ordinary (so B̂ = {X:D} is a label), having Â₁ ~ B̂ essentially forces Â₁’s structure to match B̂’s, and then C ~ D is forced.

So the genuine stuck case requires B’s X-occurrences to all be in absorbed positions. And in that case, mimicking B̂ <: Â₂ works.

I think this is the crux: if Â₁ ~ B̂ holds without forcing C ~ D, then B’s X-occurrences must all be in positions that get absorbed in any subtyping comparison B̂ <: ?, including B̂ <: Â₂.

This is a real lemma worth stating and proving:

Lemma (X-absorption from equivalence). If Â₁ ~ B̂ (for specific C, D) and C ~ D does not hold, then in any subtyping derivation B̂ <: Â', every X-occurrence in B̂ (which all come from substitution since B’s X’s are variables) is in an absorbed/dropped position.

If we can prove this, then mimicking B̂ <: Â₂ immediately yields B <: A₂ \ {X} because B’s X-occurrences only show up in absorbed/dropped positions, exactly what the avoidance system permits.

Where this leaves us

Let me back off my earlier strong claim and state things more carefully:

The lemma is plausible but not yet proven. The stuck case is real and requires either a strong absorption-style auxiliary lemma or a different proof strategy (perhaps direct via the semantic spec).
The semantic spec is probably the cleanest reference point. Given the difficulty of the syntactic proof, I’d recommend:
- First, prove the soundness of the syntactic avoidance system w.r.t. the semantic spec (A <: B \ S syntactically implies the universal quantification semantically). This should be straightforward by induction on the syntactic derivation.
- Then, prove the inversion lemma directly via the semantic spec: assuming Â ~ B̂ for specific C, D with C ≁ D, show that for all E, F, A[X -> {X:E}] ~ B[X -> {X:F}].
- Finally, the question is whether the semantic side coincides with the syntactic avoidance system. If completeness fails (the syntactic system is too weak), you have a roadmap to strengthen it.
An alternative: change the syntactic system. Instead of “avoidance set,” consider a system parameterized over two substitutions: A <: B [θ_L, θ_R] where θ_L, θ_R are environments mapping X ∈ S to types E_X, F_X. The reflexive variable rule for X ∈ S is replaced by checking {X:E_X} <: {X:F_X} per the substitutions. This is closer to a “bisubtyping” system and might be more straightforward to reason about, though at the cost of being parameterized rather than purely propositional. The semantic spec then becomes the universal closure.
Don’t rule out a counterexample yet. I tried hard above and couldn’t construct one given the constraint that X doesn’t appear as a label in A or B and B is ordinary, but I’ve not exhausted the search. In particular, I haven’t fully explored cases where Â₁ has some active X-occurrence forcing C <: D (but not D <: C), and Â₂ has another active X-occurrence forcing D <: C (but not C <: D), giving C ~ D together but neither alone. That’d be the first branch, but the structure of the reasoning is delicate.
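The parameterized alternative mentioned in the list above can be sketched as follows. This is only one possible reading of A <: B [θ_L, θ_R]: the function name `bisub`, the tuple encoding, and the `Int` base type are assumptions made for illustration, and the environments are assumed to map each X to a body that does not itself mention the avoided variables. The one non-obvious design choice is that the two environments swap when the comparison moves into a contravariant position.

```python
TOP, INT = ("top",), ("base", "Int")  # INT is a hypothetical base type

def var(x): return ("var", x)
def arr(a, b): return ("arr", a, b)
def meet(a, b): return ("and", a, b)

def bisub(a, b, th_a, th_b):
    """a <: b, where a variable X bound in th_a (resp. th_b) stands for the
    label {X: th_a[X]} on the left (resp. {X: th_b[X]} on the right)."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return bisub(a, b[1], th_a, th_b) and bisub(a, b[2], th_a, th_b)
    if a[0] == "and":
        return bisub(a[1], b, th_a, th_b) or bisub(a[2], b, th_a, th_b)
    if a[0] == "arr" and b[0] == "arr":
        # Domains are compared with sides swapped, so the environments swap too.
        return bisub(b[1], a[1], th_b, th_a) and bisub(a[2], b[2], th_a, th_b)
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and bisub(a[2], b[2], th_a, th_b)
    if a[0] == "var" and a == b:
        x = a[1]
        if x in th_a and x in th_b:
            # Replaces variable reflexivity: compare the label bodies E_X <: F_X.
            return bisub(th_a[x], th_b[x], {}, {})
        return x not in th_a and x not in th_b
    return a == b and a[0] == "base"

A = meet(arr(TOP, TOP), arr(var("X"), TOP))
B = arr(TOP, TOP)
th_l, th_r = {"X": INT}, {"X": arr(TOP, TOP)}  # incomparable E_X and F_X

print(bisub(A, B, th_l, th_r), bisub(B, A, th_r, th_l))  # True True
print(bisub(var("X"), var("X"), th_l, th_r))             # False
```

With empty environments `bisub` is just ordinary subtyping, so the semantic spec corresponds to asking that `bisub` succeed for every choice of the two environments.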

Before investing more in the syntactic system, prove the semantic version of the inversion lemma. That is:

If for specific C, D we have A[X -> {X:C}] ~ B[X -> {X:D}], then either C ~ D, or for all E, F, A[X -> {X:E}] ~ B[X -> {X:F}].

This is a clean, ground-truth statement. Once you have it (or find it false), you have a much firmer footing to either:

Calibrate your syntactic avoidance system to match it.
Or, if the semantic version is provable but the syntactic is not, accept the parameterized “bisubtyping” formulation as the natural syntactic counterpart.

I suspect the semantic version is provable by induction on the size of A + B directly, with the stuck case dissolving because the semantic statement gives you universal quantification “for free” — you don’t need to mimic a specific derivation, you just need to argue that the absorption pattern holds for all E, F.

Questions back to you

Have you tried proving the semantic version directly? If so, where does that get stuck?
In your paper, is the avoidance set notation primarily for the algorithmic check, or is it also used in a metatheoretic statement (e.g., a substitution lemma)? If primarily algorithmic, the parameterized bisubtyping might be a better fit since it’s more directly faithful to the semantics.
Does your subtyping system have a reflexivity lemma A <: A that’s proven structurally? If so, the avoidance set’s interaction with that lemma is the key question — X <: X \ {X} should fail, but you want all other uses of reflexivity to go through. This is the kind of thing that’s easy to get wrong in the system design.
Does it help to think of \ {X} as “we cannot use X’s assumption from the context”? Essentially, the avoidance set is a list of “removed” entries from a typing context where X was bound to itself. That viewpoint might suggest a cleaner formulation as a system with explicit hypotheses.

I think we need to slow down and not assume the lemma holds with the avoidance system as currently formulated. The stuck case might be telling us the system needs adjustment, or that the right phrasing is semantic. Which avenue would you like to pursue first?

You

For the first question, I haven’t really proven it, but I think the same issue would arise if I do induction on the size of the types. Because the premise is the same, the induction hypothesis would be similar. Eventually we get a semantic version of avoidance subtyping for A₁ and B, but is there any clue that lets us show the rest? We don’t know whether, for that particular added variable X, we could still do the safe intersection dropping for B. I am not sure if it works, but perhaps it does. Can you check that for me?

Regarding the second question (is the avoidance set primarily for the algorithmic check, or is it used in the meta-theory?): I initially thought it was primarily for the algorithmic check, but I actually want the avoidance set to describe the property that these variables do not appear free in an active way. So it is better to keep the avoidance set if it really works.

For the third question, does it have a reflexivity lemma? No, it doesn’t, because we are ruling out some variables, so not all variables are reflexive. However, if the free variables of the two types are disjoint from the avoidance set, then we should have the reflexivity property.

Finally, for the fourth question, I don’t think it is equivalent to the context, because we can still have Top absorption rules or intersection elimination rules. The type may still be well-formed and may still use X; X is just not actively participating in the subtyping.

Claude

Working Through Your Questions

Let me take these in order, with the first one being the most substantive.

Question 1: Does the semantic induction go through?

Let me try to actually do the induction and see where it breaks. I’ll be more careful this time.

Semantic statement to prove:

(SemInv) If for specific C, D we have A[X -> {X:C}] ~ B[X -> {X:D}], then either C ~ D, or for all E, F, A[X -> {X:E}] ~ B[X -> {X:F}].

By induction on the size of A + B. The stuck case: A = A₁ & A₂, B ordinary, Â₁ ~ B̂, B̂ <: Â₂, Â₂ <: B̂ fails.

Apply IH to (A₁, B): either C ~ D (done), or (*) for all E, F, A₁[X -> {X:E}] ~ B[X -> {X:F}].

We need: for all E, F, (A₁ & A₂)[X -> {X:E}] ~ B[X -> {X:F}].

The <: direction: (A₁ & A₂)[X -> {X:E}] <: B[X -> {X:F}]. Pick A₁[X -> {X:E}] <: B[X -> {X:F}], which holds by (*). ✓

The :> direction: B[X -> {X:F}] <: (A₁ & A₂)[X -> {X:E}], which requires both B[X -> {X:F}] <: A₁[X -> {X:E}] (have, by (*)) and B[X -> {X:F}] <: A₂[X -> {X:E}] (need to show).

Here’s the key question you’re asking: can we use B̂ <: Â₂ (one specific (C, D)) to conclude B[X -> {X:F}] <: A₂[X -> {X:E}] for arbitrary E, F?

In general, no — the specific instance doesn’t lift. But you’re hoping there’s a structural reason this does lift, given the surrounding context (specifically, that (*) holds).

Let me check whether (*) constrains the situation enough.

(*) says: for all E, F, A₁[X -> {X:E}] ~ B[X -> {X:F}]. In particular, take E = D and F = D: then A₁[X -> {X:D}] ~ B[X -> {X:D}]. Take E = C and F = C: A₁[X -> {X:C}] ~ B[X -> {X:C}]. Take any specific E, F: equivalence with mismatched substitutions on each side.

This is a strong condition on A₁ and B. In a sense, it says A₁ and B are “morally equivalent up to anything you stuff into the X slot, on either side independently.”

Now I want to try to lift B̂ <: Â₂ to arbitrary E, F.

Try this approach: pick any E, F. Consider the chain

B[X -> {X:F}]    <:?    A₂[X -> {X:E}]

We have B[X -> {X:D}] <: A₂[X -> {X:C}] from the original assumption.

By (*) with arbitrary substitutions, A₁[X -> {X:E}] ~ B[X -> {X:F}]. So

B[X -> {X:F}] ~ A₁[X -> {X:E}].

If we could chain through A₁, we’d want A₁[X -> {X:E}] <: A₂[X -> {X:E}] (a same-substitution comparison). But there’s no reason for this to hold — A₁ and A₂ are unrelated branches of the intersection, with no a priori comparability.

Alternatively, by (*) with E := E, F := D: A₁[X -> {X:E}] ~ B[X -> {X:D}] = B̂. So

A₁[X -> {X:E}] ~ B̂ <: Â₂ = A₂[X -> {X:C}].

So A₁[X -> {X:E}] <: A₂[X -> {X:C}]. And by (*) with E := E, F := F: B[X -> {X:F}] <: A₁[X -> {X:E}]. Chain:

B[X -> {X:F}] <: A₁[X -> {X:E}] <: A₂[X -> {X:C}].

We get B[X -> {X:F}] <: A₂[X -> {X:C}]. But we wanted <: A₂[X -> {X:E}], not <: A₂[X -> {X:C}].

So the chain gives <: A₂[X -> {X:C}] for arbitrary F, but with C (specific) on the right, not arbitrary E. We’re stuck on the A₂ side.

Hmm. Let me try the reverse chain with (*) instantiated as E := C, F := F: A₁[X -> {X:C}] ~ B[X -> {X:F}]. Combined with Â₁ = A₁[X -> {X:C}] <: B̂ = B[X -> {X:D}], we get B[X -> {X:F}] <: B[X -> {X:D}] (and, using Â₁ ~ B̂ in the other direction, the converse as well). That’s a constraint on B, but it has to hold for all F, meaning B’s structure is preserved under arbitrary X-substitution on its label-slot, i.e., X is in absorbed position in B.

This is exactly the absorption observation! Let me formalize:

Sub-claim from (*): If (*) holds (A₁[X -> {X:E}] ~ B[X -> {X:F}] for all E, F), then for all F, F', B[X -> {X:F}] ~ B[X -> {X:F'}].

Proof: B[X -> {X:F}] ~ A₁[X -> {X:E}] ~ B[X -> {X:F'}] for any fixed E. Done.
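Spelled out, the transitivity step uses (*) twice with the same E. The display below is just a restatement of the proof line above, assuming ~ is symmetric and transitive (which in turn assumes transitivity of <: is admissible in your system):

```latex
\begin{align*}
B[X \mapsto \{X{:}F\}]
  &\sim A_1[X \mapsto \{X{:}E\}]  && \text{by } (*) \text{ at } (E, F) \\
  &\sim B[X \mapsto \{X{:}F'\}]   && \text{by } (*) \text{ at } (E, F')
\end{align*}
```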

So B’s equivalence class is invariant under the choice of substitute for X. By a separate lemma (which still needs a proof), this means X does not actively participate in B’s subtyping behavior; equivalently, the active occurrences of X in B are all in absorbed positions.

Symmetrically, for A₁: A₁[X -> {X:E}] ~ A₁[X -> {X:E'}] for all E, E'. So X is in absorbed position in A₁ too.

Now I can lift B̂ <: Â₂ to arbitrary substitutions. Here’s the argument:

Given (*) and the substitution-invariance of B, we have B[X -> {X:F}] ~ B̂ = B[X -> {X:D}] for all F. So:

B[X -> {X:F}] ~ B̂ <: Â₂ = A₂[X -> {X:C}].

So B[X -> {X:F}] <: A₂[X -> {X:C}] for all F. ✓ on the F side.

But we still need B[X -> {X:F}] <: A₂[X -> {X:E}] for arbitrary E. We’ve established it for E = C. Can we lift to arbitrary E?

This is exactly where it gets interesting, and it’s not obvious.

We don’t yet have a direct argument that A₂[X -> {X:C}] and A₂[X -> {X:E}] are equivalent or related. The substitution-invariance argument worked for A₁ and B because we had (*) to ride on. We don’t have an analogue for A₂.

In fact, we shouldn’t expect A₂ to be substitution-invariant in general, because A₂ could perfectly well have X in active positions. The issue is whether, in those active positions, anything bad happens when we change E.

Let me think… if A₂ has X in an active position (covariantly, say), then A₂[X -> {X:C}] vs A₂[X -> {X:E}] differ in the “size” of that label, and they’re generally incomparable for arbitrary E, C.

But wait — does the situation force A₂ to be substitution-invariant too?

Let’s think about what B̂ <: Â₂ with Â₂ <: B̂ failing tells us. Â₂ is a strict supertype of B̂. Given B’s substitution-invariance, Â₂ is a supertype of B[X -> {X:F}] for all F. So Â₂ :> ⋂_F B[X -> {X:F}]… hmm, this isn’t quite giving me what I want.

Let me try a different angle. Construct a specific instance and check.

Concrete instance to check

Let me try to construct something where (*) holds but A₂ is genuinely sensitive to its substituent.

A₁ = Top -> Top, B = Top -> Top. Then Â₁ = B̂ = Top -> Top, equivalent. (*): A₁[X -> {X:E}] = Top -> Top = B[X -> {X:F}]. Substitution-invariant trivially. ✓
A₂ = ? such that B̂ <: Â₂ (i.e., Top -> Top <: A₂[X -> {X:C}]) and Â₂ <: B̂ fails.

For Top -> Top <: Â₂, Â₂ must be Top or an arrow whose codomain is a supertype of Top and… wait, Top -> Top <: T -> U requires T <: Top ✓ and Top <: U, so U = Top (or is equivalent to Top). And contravariantly T <: Top is trivially true, so T is anything. So Â₂ = T̂ -> Top for any T̂.

For Â₂ <: B̂ to fail: T̂ -> Top <: Top -> Top requires Top <: T̂, so we need T̂ not a supertype of Top. Most things satisfy that.

Pick A₂ = X -> Top, so Â₂ = {X:C} -> Top. Indeed {X:C} is not a supertype of Top (under standard rules). So Â₂ <: B̂ fails. ✓

Now check (SemInv): we want, for arbitrary E, F, (A₁ & A₂)[X -> {X:E}] ~ B[X -> {X:F}], i.e., (Top -> Top) & ({X:E} -> Top) ~ Top -> Top.

<: direction: pick Top -> Top (the left branch). ✓
:> direction: Top -> Top <: (Top -> Top) & ({X:E} -> Top). Need Top -> Top <: Top -> Top ✓ and Top -> Top <: {X:E} -> Top (contravariantly {X:E} <: Top ✓, covariantly Top <: Top ✓). ✓

Yes! It works. And note: A₂ = X -> Top is not substitution-invariant on its own (different E give different label types in the domain), but the equivalence with B still holds because the X is in an absorbed position (under Top contravariantly).

So we got lucky here — X in A₂ is in an absorbed position, and it doesn’t matter what E is.

Trying to break it: A₂ with active X

Let me construct a case where A₂ has X in a non-absorbed position, but B̂ <: Â₂ still holds.

If B̂ <: Â₂ holds and Â₂ has an active X on the right (e.g., covariant {X:C}), then B̂ must have a matching {X:?} on the left in the corresponding position — meaning B has X (as a variable) in that same position.

But by the substitution-invariance of B, that X in B must itself be in an absorbed-only position in B’s subtyping behavior. Contradiction? Not necessarily: B’s X could be in a position that’s absorbed when B is on the right (as in Â₁ <: B̂) but active when B is on the left (as in B̂ <: Â₂).

Hmm, but absorption is a property of how the variable is positioned in the type, not of which side of <: it’s on. If X is under a Top covariantly (like Top -> X), then in A <: Top -> X, the X is on the right covariantly, actively constraining A’s codomain. So position-relative-to-other-structure determines activity, not the side of <:.

Let me concretely try:

B = X -> Top, so B̂ = {X:D} -> Top.
A₁ = X -> Top, so Â₁ = {X:C} -> Top.

Â₁ ~ B̂: contravariant {X:D} <: {X:C} and {X:C} <: {X:D}, i.e., D <: C and C <: D, so C ~ D.

Damn, we get C ~ D immediately.
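This forcing can be checked on a few concrete C and D. A minimal sketch, assuming the same kind of toy checker (Top, non-distributive intersections, arrows, labels); the encoding, the `Int` base type, and the sample list are illustrative assumptions. It asserts that, for A₁ = B = X -> Top, the substituted types are equivalent exactly when C ~ D, and then lists the sampled pairs that are equivalent without being syntactically equal.

```python
from itertools import product

TOP, INT = ("top",), ("base", "Int")  # INT is a hypothetical base type

def var(x): return ("var", x)
def lab(x, t): return ("lab", x, t)
def arr(a, b): return ("arr", a, b)
def meet(a, b): return ("and", a, b)

def sub(a, b):
    """Algorithmic a <: b: Top, non-distributive intersections, arrows, labels."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "arr" and b[0] == "arr":
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2])
    return a == b and a[0] in ("var", "base")

def equiv(a, b):
    """Mutual subtyping."""
    return sub(a, b) and sub(b, a)

def subst(t, x, s):
    """Replace the variable x by s; label names are left untouched."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, s))
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, s), subst(t[2], x, s))
    return t

SAMPLES = [TOP, INT, arr(TOP, TOP), meet(INT, TOP)]
A1 = B = arr(var("X"), TOP)

for c, d in product(SAMPLES, repeat=2):
    hat_a1 = subst(A1, "X", lab("X", c))
    hat_b = subst(B, "X", lab("X", d))
    # The contravariant label comparison runs in both directions.
    assert equiv(hat_a1, hat_b) == equiv(c, d)

# Pairs that are equivalent but not syntactically equal.
forced = [(c, d) for c, d in product(SAMPLES, repeat=2) if c != d and equiv(c, d)]
print(forced)
```

The `forced` list is why the first branch of the lemma has to say C ~ D rather than C = D: with intersections, Int and Int & Top are mutually subtypes without being the same type.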

Try A₁ = (X -> Top) -> Top -> Top… I’ll just think structurally.

For Â₁ ~ B̂ without forcing C ~ D: any X-occurrence in A₁ must be in absorbed position, and any X-occurrence in B must be too. If B has X not in absorbed position, then matching it from A₁’s side will force the labels to compare both ways, giving C ~ D.

So (*) (which is what the IH gives us when C ≁ D) forces B to have X in absorbed position only.

This is exactly the condition needed to mimic B̂ <: Â₂! Because if B’s X is only in absorbed positions, then in any derivation of B̂ <: T (for any T, including T = Â₂), the X-occurrences in B̂ (which all come from substituting B’s X) appear in absorbed positions too. Substituting different things for X doesn’t change this absorption pattern. So B[X -> {X:F}] <: A₂[X -> {X:E}] should follow for the right E’s.

But wait, does it follow for arbitrary E?

Let me re-examine. The derivation B̂ <: Â₂ proceeds by structural induction on the types. Each step matches a constructor or applies absorption (_ <: Top) or intersection elimination. When we vary the substitution for X in B (changing D to F), B̂’s shape changes, but every X in B is in absorbed position, so the changes don’t matter: they just slot different label types into spots that get absorbed. ✓

When we vary the substitution in A₂ (changing C to E), Â₂’s shape changes. If A₂’s X is in active position, then Â₂ genuinely changes, and the derivation might break.

However, A₂’s X-occurrences that match into B̂’s X-occurrences require B to have a corresponding X in active position, which we’ve ruled out. So A₂’s X-occurrences either:

(1) Don’t match into B̂’s X-occurrences (because B̂’s X-positions are all absorbed and don’t reach A₂’s X-positions structurally).
(2) Are themselves in absorbed positions in the comparison B̂ <: Â₂.

In case (1), A₂’s X-occurrence on the right of <: (covariant context) requires something on the left to match it. Since B’s X-position doesn’t match (it’s elsewhere in absorbed position), the match must come from… hmm, the only way B̂ <: T_X (where T_X contains {X:_} as label) is if B̂’s structure provides the matching label. So if B̂ doesn’t have a label in that position, B̂ <: T_X fails. So case (1) is empty.

Therefore A₂’s X-occurrences are all in absorbed positions in the comparison too. So changing C to E doesn’t affect derivability. ✓

OK so I think it actually works:

Provisional conclusion (Q1): Yes, the semantic induction appears to go through, provided the lemmas below hold. The key insight is that (*) (the IH) forces B to have X in absorbed positions only, and by structural matching, A₂’s X-occurrences in the derivation B̂ <: Â₂ must also be in absorbed positions, so the derivation lifts to arbitrary substitutions.

The argument has three main lemmas:

Lemma 1 (Substitution-invariance from full equivalence). If A[X -> {X:E}] ~ B[X -> {X:F}] for all E, F, then B[X -> {X:F}] ~ B[X -> {X:F'}] for all F, F'.

Proven above by transitivity through A.

Lemma 2 (Absorption from substitution-invariance). If B[X -> {X:F}] ~ B[X -> {X:F'}] for all F, F', then in any subtyping derivation B[X -> {X:F}] <: T for any T, the occurrences of {X:F} (substituted for B’s X) appear in absorbed positions.

This is the heart of it — it requires a careful induction on derivations. I believe it’s true but it needs a real proof.

Lemma 3 (Lifting to arbitrary substitutions). If B[X -> {X:D}] <: A₂[X -> {X:C}] and B’s X-occurrences are absorbed (Lemma 2), then for all E, F, B[X -> {X:F}] <: A₂[X -> {X:E}].

Provable by induction on the original derivation, replacing absorbed occurrences freely.

So I’m now optimistic that the semantic version of the lemma is provable.

Question 2: Avoidance set as a meta-theoretic property

If you want the avoidance set to describe the semantic property (not just be an algorithm), then you have an obligation: prove soundness and completeness of the syntactic avoidance system w.r.t. the semantic spec.

Soundness (A <: B \ S syntactically ⟹ ∀X∈S, ∀E,F. A[X→{X:E}] <: B[X→{X:F}]) should be straightforward.

Completeness (semantic ⟹ syntactic) is the hard direction. Given the difficulty you’ve encountered with the stuck case, completeness might actually fail for the avoidance system as currently formulated, even if the semantic version of the inversion lemma holds.

If completeness fails, you have a few options:

Strengthen the syntactic system (add rules) until completeness holds.
Live with incompleteness, and use the semantic spec for meta-theoretic statements while using the syntactic system algorithmically (treating syntactic as a sound-but-conservative approximation).
Reformulate the syntactic system entirely.

Given that the lemma is a meta-theoretic statement, I’d recommend stating it semantically first and then asking what syntactic system best captures it.
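To make the soundness and completeness questions concrete, here is a minimal sketch of one reading of the syntactic avoidance judgment, together with a spot check against a sampled version of the semantic spec. The rule set (ordinary algorithmic subtyping, with variable reflexivity refused for names in S), the tuple encoding, the `Int` base type, and the small list of test types are all assumptions for illustration; your actual rules may differ.

```python
from itertools import product

TOP, INT = ("top",), ("base", "Int")  # INT is a hypothetical base type

def var(x): return ("var", x)
def lab(x, t): return ("lab", x, t)
def arr(a, b): return ("arr", a, b)
def meet(a, b): return ("and", a, b)

def sub_avoid(a, b, s=frozenset()):
    """a <: b avoiding s. With s empty this is ordinary subtyping."""
    if b[0] == "top":                    # absorption by Top is always allowed
        return True
    if b[0] == "and":
        return sub_avoid(a, b[1], s) and sub_avoid(a, b[2], s)
    if a[0] == "and":                    # dropping a component is always allowed
        return sub_avoid(a[1], b, s) or sub_avoid(a[2], b, s)
    if a[0] == "arr" and b[0] == "arr":
        return sub_avoid(b[1], a[1], s) and sub_avoid(a[2], b[2], s)
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub_avoid(a[2], b[2], s)
    if a[0] == "var":                    # reflexivity is refused for avoided names
        return a == b and a[1] not in s
    return a == b and a[0] == "base"

def subst(t, x, s):
    """Replace the variable x by s; label names are left untouched."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, s))
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, s), subst(t[2], x, s))
    return t

SAMPLES = [TOP, INT, arr(TOP, TOP)]

def avoids_sampled(a, b, x):
    """Sampled semantic spec: a[x -> {x:E}] <: b[x -> {x:F}] for all sampled E, F."""
    return all(sub_avoid(subst(a, x, lab(x, e)), subst(b, x, lab(x, f)))
               for e, f in product(SAMPLES, repeat=2))

X = var("X")
TYPES = [X, arr(TOP, X), arr(X, TOP), arr(TOP, TOP),
         meet(arr(TOP, TOP), arr(X, TOP)), meet(arr(TOP, X), arr(TOP, TOP))]

# Soundness spot check: every syntactic judgment must pass the sampled spec.
unsound = [(a, b) for a, b in product(TYPES, repeat=2)
           if sub_avoid(a, b, {"X"}) and not avoids_sampled(a, b, "X")]
# Candidate completeness gaps: the sampled spec holds but the syntax refuses.
gaps = [(a, b) for a, b in product(TYPES, repeat=2)
        if avoids_sampled(a, b, "X") and not sub_avoid(a, b, {"X"})]
print(len(unsound), len(gaps))
```

Both lists come out empty for this toy checker. The reason `gaps` is empty here is that the sample contains an incomparable pair (Int and Top -> Top), and under that pair every X-against-X comparison fails, so the substituted check behaves like the syntactic one. That is an observation about this one-directional sketch only; it says nothing about whether the equivalence-level inversion lemma is captured syntactically.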

Question 3: Reflexivity

Right, your point is correct: reflexivity holds when the free variables of the type are disjoint from the avoidance set. This is a natural restriction — you can’t claim X <: X is “vacuous” if X is in the avoidance set, because that’s exactly the comparison we’re disallowing.

But notice: the semantic spec gives you reflexivity for most types, with some subtle cases. If T = Top -> X and X is in the avoidance set, then semantically T <: T \ {X} would require (Top -> X)[X→{X:E}] <: (Top -> X)[X→{X:F}] for all E, F. This reduces to X[X→{X:E}] <: X[X→{X:F}], i.e., {X:E} <: {X:F}, i.e., E <: F, which is not true for arbitrary E, F. So semantically, Top -> X <: Top -> X \ {X} fails. ✓ This matches your syntactic intuition that the active occurrence of X blocks reflexivity.

But for T = X -> Top: semantically, T[X→{X:E}] <: T[X→{X:F}] is {X:E} -> Top <: {X:F} -> Top, requiring {X:F} <: {X:E}, i.e., F <: E. Again not true for arbitrary E, F. So semantically also fails.

For T = (Top -> X) -> Top: semantically (Top -> {X:E}) -> Top <: (Top -> {X:F}) -> Top requires (Top -> {X:F}) <: (Top -> {X:E}), i.e., {X:F} <: {X:E}, i.e., F <: E. Fails for arbitrary E, F.

For T = Top -> Top: trivially T <: T \ {X}. ✓

For T = (Top -> X) & (Top -> Top): semantically, ((Top -> {X:E}) & (Top -> Top)) <: ((Top -> {X:F}) & (Top -> Top)). The right is an intersection, needing both: ... <: Top -> {X:F} (pick the left branch on the left side: Top -> {X:E} <: Top -> {X:F} requires E <: F, fails). Hmm wait, on the left side we have an intersection too — so we can pick either branch. The Top -> Top branch: Top -> Top <: Top -> {X:F} requires Top <: {X:F}, fails. So we must use the Top -> {X:E} branch, which fails. So semantically reflexivity fails here.

So even for an intersection that has a “safe” branch, reflexivity fails if the unsafe branch matters on the right. This is an important constraint — it’s not enough that some branch is X -free, you need every required component to be X -free or otherwise absorbed.

This reinforces that the semantic spec is the right ground truth, and the syntactic system needs to be carefully aligned.
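The reflexivity cases above can be replayed on a small sample of E and F. A minimal sketch, assuming a toy checker for Top, non-distributive intersections, arrows, and labels; the encoding, the `Int` base type, and the helper names are illustrative assumptions. A `False` result is a genuine counterexample to semantic reflexivity, while `True` only means no sampled pair broke it.

```python
from itertools import product

TOP, INT = ("top",), ("base", "Int")  # INT is a hypothetical base type

def var(x): return ("var", x)
def lab(x, t): return ("lab", x, t)
def arr(a, b): return ("arr", a, b)
def meet(a, b): return ("and", a, b)

def sub(a, b):
    """Algorithmic a <: b: Top, non-distributive intersections, arrows, labels."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "arr" and b[0] == "arr":
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2])
    return a == b and a[0] in ("var", "base")

def subst(t, x, s):
    """Replace the variable x by s; label names are left untouched."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, s))
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, s), subst(t[2], x, s))
    return t

SAMPLES = [TOP, INT, arr(TOP, TOP)]

def reflexive_sampled(t, x):
    """Does t[x -> {x:E}] <: t[x -> {x:F}] hold for every sampled (E, F)?"""
    return all(sub(subst(t, x, lab(x, e)), subst(t, x, lab(x, f)))
               for e, f in product(SAMPLES, repeat=2))

X = var("X")
CASES = {
    "Top -> X": arr(TOP, X),
    "X -> Top": arr(X, TOP),
    "(Top -> X) -> Top": arr(arr(TOP, X), TOP),
    "Top -> Top": arr(TOP, TOP),
    "(Top -> X) & (Top -> Top)": meet(arr(TOP, X), arr(TOP, TOP)),
    "(Top -> Top) & (X -> Top)": meet(arr(TOP, TOP), arr(X, TOP)),
}
RESULTS = {name: reflexive_sampled(t, "X") for name, t in CASES.items()}
for name, ok in RESULTS.items():
    print(name, ok)
```

The last case is the example from the start of the conversation, added for contrast: it is reflexive on the sample even though X occurs in it, because the X -> Top branch is always derivable from the Top -> Top branch.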

Question 4: The avoidance set is not equivalent to a context

Right, agreed. The avoidance set is more like a “no-go list” for variables to actively participate, while the variables remain in scope. This is genuinely different from removing them from the context. Specifically:

Removing from context: the variable X is no longer in scope at all; using it is a free-variable error.
Avoidance set: the variable X is still in scope; the type can mention it; you just can’t actively use it (no reflexivity), though you can have it absorbed by Top or dropped via intersection elimination.

The avoidance set is a “use-tracking” mechanism, perhaps closer to linear-type or modality-style annotations than to standard context manipulation. This might suggest looking at the semantic side through the lens of parametricity (in the Reynolds sense): the avoidance set marks variables that are “parametrically uniform” in the type’s behavior.

Summary and recommendation

The semantic version of the inversion lemma appears provable. I sketched the argument above, with three sub-lemmas, the key one being that (*) (the IH for A₁ and B) forces B’s X-occurrences to be in absorbed positions, which then lifts the one-directional B̂ <: Â₂ to arbitrary substitutions.
Whether the syntactic avoidance system completely captures this is a separate, harder question — and your stuck case suggests it might not, at least not without strengthening the syntactic rules.
Recommended strategy:
- Prove the semantic lemma first as ground truth.
- Then test whether the syntactic system is complete w.r.t. the semantic spec. The check on the absorption/dropping rules I mentioned earlier should be a useful guide.
- If complete, you have a beautiful syntactic characterization. If not, you’ve identified missing rules (or accepted incompleteness as a controlled approximation).
Caveats and things I’m unsure about:
- Lemma 2 (absorption from substitution-invariance) needs a careful proof, and I’ve waved at it. It might fail in edge cases I haven’t considered, particularly with complex intersection structures.
- I assumed the semantic lemma’s induction on type size goes through cleanly, but there could be other stuck cases beyond the one we’ve been discussing (e.g., when B is also an intersection, even though you’ve deferred that). The shape of the inductive argument should generalize, but cross-cases might bring new complications.

Would you like me to attempt a more rigorous write-up of Lemma 2, or work through what additional rules the syntactic avoidance system might need to be complete?

You

I agree with you that we should look at the semantic specification first to see if there are problems, and then work on the syntactic rule, working out what is the correct version to characterize that semantic version. So let’s first focus on your answer to the first point, because that’s the semantic property I want to prove for inversion substitution with nominal labels.

Lemma 1 (Substitution-invariance from full equivalence). If A[X -> {X:E}] ~ B[X -> {X:F}] for all E, F, then B[X -> {X:F}] ~ B[X -> {X:F'}] for all F, F'.

I agree that this property is probably something we need, but I’m a bit suspicious about your optimism in this lemma. In particular, I think this lemma won’t hold if B is an intersection? I’m not sure. Wouldn’t the example like: Top → Top ~ (X → Top) & (Top → Top) be affected?

Adding an ordinariness constraint wouldn’t be a workaround if this example, or something sharing its structure, breaks the property, because we could write Int → (Top → Top) ~ Int → ((X → Top) & (Top → Top)) to circumvent the ordinary (non-intersection) constraint. Can you work a bit harder on this? I think for the example I present, even if B is an intersection, it doesn’t really matter because X is always absorbed by Top. I just want to know if that’s always the case, or whether the lemma would break if B is an intersection that contains X, where on the two sides the occurrences are absorbed or substituted with different F and F'.

Claude

Re-examining Lemma 1

You’re absolutely right to push back. Let me look at this carefully.

The example you raise

Take A = Top -> Top, B = (X -> Top) & (Top -> Top). Then for any E, F:

A[X -> {X:E}] = Top -> Top
B[X -> {X:F}] = ({X:F} -> Top) & (Top -> Top)

Is A[X -> {X:E}] ~ B[X -> {X:F}]?

<: direction: Top -> Top <: ({X:F} -> Top) & (Top -> Top) requires Top -> Top <: {X:F} -> Top (✓: contravariant {X:F} <: Top, covariant Top <: Top) and Top -> Top <: Top -> Top ✓.
:> direction: ({X:F} -> Top) & (Top -> Top) <: Top -> Top. Pick the right branch: Top -> Top <: Top -> Top ✓.

Yes, equivalent for all E, F. So the premise of Lemma 1 is satisfied.

Now, is B[X -> {X:F}] ~ B[X -> {X:F'}]?

({X:F} -> Top) & (Top -> Top) ~ ({X:F'} -> Top) & (Top -> Top)
Both sides equivalent to Top -> Top (by the same reasoning above). So yes, they’re equivalent — by transitivity through A. ✓

OK in this case Lemma 1 still holds. But your point is a good one: the equivalence holds because the X in B is in absorbed position, but Lemma 1’s statement doesn’t immediately reveal this, and we should worry about cases where B’s structure forces the equivalence to depend on F.
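Both your example and the nested variant with Int in front can be checked on a small sample. A minimal sketch, assuming a toy checker for Top, non-distributive intersections, arrows, and labels; the encoding, the helper names, and the sample list are illustrative assumptions, and a finite sample can only fail to find a counterexample, not prove the lemma.

```python
from itertools import product

TOP, INT = ("top",), ("base", "Int")  # Int as in your nested example

def var(x): return ("var", x)
def lab(x, t): return ("lab", x, t)
def arr(a, b): return ("arr", a, b)
def meet(a, b): return ("and", a, b)

def sub(a, b):
    """Algorithmic a <: b: Top, non-distributive intersections, arrows, labels."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "arr" and b[0] == "arr":
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2])
    return a == b and a[0] in ("var", "base")

def equiv(a, b):
    """Mutual subtyping."""
    return sub(a, b) and sub(b, a)

def subst(t, x, s):
    """Replace the variable x by s; label names are left untouched."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, s))
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, s), subst(t[2], x, s))
    return t

def inst(t, body):
    """t[X -> {X: body}]."""
    return subst(t, "X", lab("X", body))

PAIRS = list(product([TOP, INT, arr(TOP, TOP)], repeat=2))

def premise(a, b):      # a[X -> {X:E}] ~ b[X -> {X:F}] on the sample
    return all(equiv(inst(a, e), inst(b, f)) for e, f in PAIRS)

def invariant(b):       # b[X -> {X:F}] ~ b[X -> {X:F'}] on the sample
    return all(equiv(inst(b, f), inst(b, g)) for f, g in PAIRS)

A = arr(TOP, TOP)
B = meet(arr(var("X"), TOP), arr(TOP, TOP))
A_nested, B_nested = arr(INT, A), arr(INT, B)

print(premise(A, B), invariant(B))                       # True True
print(premise(A_nested, B_nested), invariant(B_nested))  # True True
```

In both cases the branch X -> Top never has to be compared label-against-label: whenever it appears on the right it is derived from Top -> Top, and whenever it appears on the left it can be dropped.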

The general question

Lemma 1’s proof I gave was pure transitivity:

B[X -> {X:F}] ~ A[X -> {X:E}] ~ B[X -> {X:F'}]

using the premise twice with the same E (any choice) but different F’s. That proof is valid regardless of whether B is an intersection: it only uses the premise and transitivity of ~.

So Lemma 1 itself, as a purely formal consequence of the premise, holds. The interesting question is: does the premise itself have absurd consequences when B is an intersection containing active X?

Let me check by trying to construct a B with active X such that the premise still holds.

Attempt to break it

I want B an intersection where some branch has active X, but A[X→{X:E}] ~ B[X→{X:F}] holds for all E, F.

If a branch Bᵢ of B has active X, then B̂ <: Bᵢ̂, so Â <: Bᵢ̂ (active X on the right). For Â <: Bᵢ[X→{X:F}] to hold for all F, A must structurally match Bᵢ and provide an X-bearing label in the matching position. But the substituent on A’s side is {X:E} (free choice), and on B’s side is {X:F} (free choice independently). For the inequality to hold for all E, F, we’d need {X:E} <: {X:F} for all E, F, which forces a universal E <: F, and that is false.

So if B’s branch has active X, Â <: B̂ cannot hold uniformly. This means the premise rules out active X in any branch of B that is “required” by some right-hand-side inclusion.

But (and this is the subtle part) B being an intersection means Â <: B̂ is satisfied by satisfying both Â <: B₁̂ and Â <: B₂̂. So every branch of B must be reachable from A in the <: direction. Each branch Bᵢ̂ must be a supertype of Â. So each branch’s active X-occurrences (covariantly) require matching X in A, which we’ve argued can’t hold uniformly.

Hmm wait — what about contravariant X in Bᵢ? Like Bᵢ = X -> Top. Then Â <: Bᵢ̂ = {X:F} -> Top requires (covariantly) Top <: Top ✓ and (contravariantly, “absorbed”) {X:F} <: Top ✓ — no constraint on F. So contravariant X under Top is absorbed and the premise holds for all F.

So X can appear in B actively-by-position in branches, as long as all such occurrences are absorbed by surrounding structure (Top, intersection elimination). This matches your example.

Can X appear in B in a position that is not absorbed?

Try B = (X -> X) & (Top -> Top). Then B[X→{X:F}] = ({X:F} -> {X:F}) & (Top -> Top). The left branch has X covariantly in the codomain — not absorbed.

For Â <: ({X:F} -> {X:F}) (taking branch 1), we need A to match an arrow whose codomain is {X:E} (or contains {X:E} actively), and we need {X:E} <: {X:F} for all E, F. Fails. So Â <: B[X→{X:F}] fails for the branch-1 inclusion when we require uniformity.

But wait — does Â <: B[X→{X:F}] require hitting branch 1? Yes, because B[X→{X:F}] is an intersection, and Â <: B₁ & B₂ requires both.

So if B has any branch with active non-absorbed X, the premise Â ~ B̂ for all E, F fails. Therefore the premise forces all X-occurrences in B’s branches to be in absorbed positions, and Lemma 1’s conclusion (substitution-invariance of B) follows because the absorption is independent of F.

So Lemma 1 holds, even when B is an intersection — the premise is so strong that it constrains B substantially.
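The three shapes of B discussed above can be checked mechanically on a small scale. The following is a minimal sketch, assuming a non-distributive algorithmic reading of the rules (Top, variable reflexivity, arrows, labels, intersection on either side); the tuple encoding, the helper names, and the three sample bodies in SAMPLES are illustrative assumptions, not part of your system. Because the quantifier "for all E, F" is only sampled, a True result is evidence rather than proof, while a False result is a genuine refutation under these rules.

```python
from itertools import product

# Hypothetical encoding of types as tagged tuples:
#   ("top",)  ("var", x)  ("arr", dom, cod)  ("lab", l, body)  ("and", left, right)
TOP = ("top",)

def sub(a, b):
    """Assumed algorithmic subtyping, without distributivity."""
    if b[0] == "top":
        return True
    if b[0] == "and":      # intersection on the right: every branch is required
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":      # intersection on the left: one branch suffices
        return sub(a[1], b) or sub(a[2], b)
    if a[0] != b[0]:
        return False
    if a[0] == "var":
        return a[1] == b[1]
    if a[0] == "arr":      # contravariant domain, covariant codomain
        return sub(b[1], a[1]) and sub(a[2], b[2])
    return a[1] == b[1] and sub(a[2], b[2])   # labels: same name, compare bodies

def subst(t, x, body):
    """t[x -> {x: body}]: replace variable x by the label type {x: body}."""
    if t[0] == "var":
        return ("lab", x, body) if t[1] == x else t
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, body), subst(t[2], x, body))
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, body))
    return t

def equiv(a, b):
    return sub(a, b) and sub(b, a)

# Pairwise non-equivalent sample bodies standing in for "all E, F".
SAMPLES = [TOP, ("lab", "n", TOP), ("arr", TOP, TOP)]

def uniform_equiv(a, b, x):
    """Sampled version of: for all E, F. a[x -> {x:E}] ~ b[x -> {x:F}]."""
    return all(equiv(subst(a, x, e), subst(b, x, f))
               for e, f in product(SAMPLES, repeat=2))

X = ("var", "X")
TT = ("arr", TOP, TOP)                          # Top -> Top
A = TT
B_absorbed = ("and", ("arr", X, TOP), TT)       # (X -> Top) & (Top -> Top)
B_codomain = ("and", ("arr", X, X), TT)         # (X -> X) & (Top -> Top)
B_covariant = ("and", ("arr", TOP, X), TT)      # (Top -> X) & (Top -> Top)

print(uniform_equiv(A, B_absorbed, "X"))    # contravariant X against Top
print(uniform_equiv(A, B_codomain, "X"))    # X in the codomain, not absorbed
print(uniform_equiv(A, B_covariant, "X"))   # covariant X, not absorbed
```

Only the first pair survives the sampled check; the other two already fail at Â <: B̂ because the right-hand intersection demands the X-bearing branch.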

The deeper concern

But there’s a subtler worry I want to address. The IH gives us (*) only for the ordinary case (A₁, B) in the stuck case I was working through. So when I invoke Lemma 1 on (*), I’m getting substitution-invariance for B, which is fine.

But the overall induction on the original lemma (SemInv) — when we recurse into (A₁, B) — we want to apply IH to (A₁, B) where A₁ may be an intersection (e.g., A = (A₁₁ & A₁₂) & A₂). That’s not the issue you raised, but it’s worth noting that IH applies because it’s stated for general A, B.

The stuck case I focused on had B ordinary. Lemma 1 applied to that case gives B’s substitution-invariance, where B is ordinary. Good.

But the outer B in the lemma statement could be an intersection too — which is the case you said you’d defer. When B is an intersection, decompose: B = B₁ & B₂. Then Â ~ B̂ gives Â <: B̂ and B̂ <: Â. From Â <: B̂ = B₁̂ & B₂̂, both Â <: B₁̂ and Â <: B₂̂. From B̂ <: Â, either B₁̂ <: Â or B₂̂ <: Â (if Â is ordinary) — but Â could itself be an intersection, in which case we have similar branching.

In the symmetric stuck-style case where, say, Â ~ B₁̂ holds in both directions but only Â <: B₂̂ holds for the other branch, the structure is the same as before, with roles potentially swapped. The argument should symmetrize, but you’re right that there are more cases.

Refining the worry: what if the absorbing structure is shared across branches?

Here’s a sharper worry that I think is what you’re pointing at. Can B’s X-occurrence be absorbed by structure in another branch via intersection elimination on the right-hand side of a <:?

Example: B = (X -> Top) & (Top -> Top). When Â <: B̂, we hit both branches: Â <: (X -> Top)[X→{X:F}] (i.e., Â <: {X:F} -> Top) and Â <: Top -> Top. The first holds because {X:F} is on the contravariant side, absorbed by Top. So yes, the first branch’s X is absorbed.

When B̂ <: Â, we use intersection elimination on the left: pick one branch of B̂ to subtype Â. Pick Top -> Top (the second branch), which doesn’t have X. So B̂ <: Â reduces to Top -> Top <: Â, no X in play.

So in this scenario, the first branch of B has X (in absorbed position relative to its own structure) but never participates in the B̂ <: Â direction (it’s dropped via intersection elimination on the left). The X in B’s first branch is absorbed both by sitting in a contravariant position that is compared against Top and by being eliminated in the relevant direction.

Substitution-invariance still holds here because changing F to F' only affects the first branch’s X, which is in absorbed position both ways. ✓

The genuinely worrying case

What if B’s branch has X in an absorbed position only by virtue of intersection elimination, not by Top-absorption? E.g., the X could be in an “active-looking” position in its own branch, but the branch is dropped.

Example: B = (X -> X) & (Top -> Top). The first branch has X covariantly in the codomain — not absorbed by Top. So is the premise satisfied?

B[X→{X:F}] = ({X:F} -> {X:F}) & (Top -> Top).

For Â ~ B̂ for all E, F: in particular, Â <: ({X:F} -> {X:F}) & (Top -> Top) requires Â <: {X:F} -> {X:F}. A must structurally match (be an arrow, or an intersection containing one), and the codomain match requires something <: {X:F} for arbitrary F. The only candidates are types of the form {X:G} (or intersections containing one) with G <: F for all F, which would make G a bottom type. Assuming the system has no bottom type, this fails.

Hence the premise forces the first branch’s X to be absorbed (or that branch never to be required on the right). Since intersection on the right requires all branches, the only way to “avoid” the active-X branch on the right is not to have it. That is, the structure of B forces all branches to be X-absorbed on the covariant-active side.

So if B = (X -> X) & (Top -> Top), the premise fails, ruling this case out. Good — the premise really is strong enough to rule out non-absorbed X in B.

Verdict on Lemma 1

I think Lemma 1 holds. The proof I gave (pure transitivity) is valid regardless of B’s shape. Your worry was that the premise might be unstable when B is an intersection with active X, but I’ve argued that the premise itself forbids active non-absorbed X in B’s branches, because:

Intersection on the right (in Â <: B̂) requires every branch of B̂ to be a supertype of Â.
If a branch has covariant non-absorbed X, requiring Â <: branch[X→{X:F}] for all F forces a uniform constraint on A-side substituents that can’t hold (unless Â doesn’t have a corresponding active X, in which case the structural match fails).
So all branches’ X-occurrences must be absorbed.

And Lemma 1’s conclusion follows.

However — a remaining concern

Your example Top -> Top ~ (X -> Top) & (Top -> Top) works but is fragile in one way: the absorption of X in B’s first branch happens position-internally (X is contravariant and compared against Top). What if absorption requires a non-local argument, like intersection elimination?

Specifically: can X appear in B’s branch in a position that’s “active” in that branch alone, but the branch is dropped during B̂ <: Â?

Example: B = X & Top where Top is the second branch. Then B[X→{X:F}] = {X:F} & Top.

Â <: {X:F} & Top for all E, F: requires Â <: {X:F} for all F (the first branch is required). Need A to provide a label {X:E} with {X:E} <: {X:F} for all E, F. Fails. So this premise is unsatisfiable for non-trivial Â.

What if B = (X -> Top) & U where U is some non- X type?

Â <: (X -> Top)[X→{X:F}] requires Â <: {X:F} -> Top. Contravariantly, we need {X:F} <: dom(A)[X→{X:E}], where dom(A) is A’s domain. If A’s domain is Top, then {X:F} <: Top ✓ for all F.

So X in B’s branch can be absorbed by Top in A’s domain (contravariantly). This is your example. Fine.

What if B = (Top -> X) & (Top -> Top) (covariant X in branch 1, not absorbed within the branch)?

Â <: Top -> {X:F} requires covariantly ? <: {X:F}, where ? is A’s codomain (under the appropriate substitution). For all F: if A’s codomain is Top, then Top <: {X:F} fails. If A’s codomain is {X:E}, then {X:E} <: {X:F} requires E <: F for all E, F, which fails. So the premise fails.

Good — confirms the picture.

So I retract some worry but not all

Lemma 1 holds. The premise constrains B heavily (and A symmetrically), forcing absorption of all active X -occurrences. Where my earlier optimism was justified: the chain of lemmas does go through for the case B ordinary.

Where I should be more careful, though, and which your question prompts: when B is itself an intersection, the original lemma (SemInv) has additional cases where the IH structure changes. The semantic version’s induction has cases for A and B both ranging over intersection-or-ordinary, not just A an intersection with B ordinary. For the case B = B₁ & B₂ with Â ordinary — the symmetric “stuck case” — you’d recurse on (A, B₁) and want to lift Â <: B₂̂ (one direction). The argument should symmetrize, but it’s worth checking explicitly.

A residual worry I haven’t fully resolved

There’s one case I want to flag honestly. In the original stuck case, my argument went:

IH on (A₁, B) gives (*).
From (*), by Lemma 1, B is substitution-invariant.
From B’s substitution-invariance, B’s X-occurrences are absorbed (Lemma 2).
Therefore B̂ <: Â₂ lifts to arbitrary substitutions.

Step 3 → 4 implicitly assumes that absorption of X in B propagates through any derivation B̂ <: T, including B̂ <: Â₂. I argued informally that A₂’s X-occurrences must match into B’s X-occurrences (which are absorbed) or be in absorbed positions themselves. But a fully rigorous proof of Lemma 2’s consequence (that derivations B̂ <: T lift to arbitrary F-substitutions) needs to be done by careful induction on the derivation, and I haven’t fully checked all cases (e.g., what happens when T is itself an intersection with multiple branches having different absorption structures).

So I want to be honest: I believe Lemma 1 is true and the overall argument is on the right track, but Lemma 2 (and its use in step 4) is the technically delicate part where unforeseen cases might surface. The case you raised gave me a chance to check Lemma 1 more carefully, and it survives. But I’d want to write out Lemma 2 in detail before claiming the full proof.

Would you like me to attempt that next, focusing specifically on the inductive structure of “absorption propagation” and where it might or might not go through?

You

I see your point that Lemma 1 will be proved by transitivity, which is great. I don’t think Lemma 2 and Lemma 3 that you propose are necessary for my proof framework. We can try to summarize what we have so far: 1. We are suggesting a statement and proof for the Nominal Inversion Lemma on a semantic basis. 2. The problem I’m having is that I want to track multiple variables to be semantically avoided instead of just X. Do you think Lemma 2 can be extended to prove that?

Claude

Generalizing to Multiple Avoided Variables

This is a great question, and let me think carefully before giving an answer, because the multi-variable generalization is where things tend to subtly break.

Setting up the multi-variable statement

The single-variable semantic spec was:

A <: B \ {X} semantically iff ∀E, F. A[X -> {X:E}] <: B[X -> {X:F}].

The multi-variable generalization should be:

A <: B \ S semantically iff ∀(Eₓ)ₓ∈S, (Fₓ)ₓ∈S. A[θ_E] <: B[θ_F], where θ_E = [X -> {X:Eₓ}]ₓ∈S and θ_F = [X -> {X:Fₓ}]ₓ∈S are simultaneous substitutions, with the substituents on each side independent (separate E and F families).
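To make the independence of the two families concrete, here is a small sampled check of the multi-variable definition. It is a sketch under stated assumptions: the subtyping rules are a non-distributive algorithmic reading, the tuple encoding and the names subst_all, families, avoid_sub and avoid_equiv are hypothetical, and the universal quantifier is approximated by three sample bodies, so a True result is only evidence.

```python
from itertools import product

# Hypothetical encoding of types as tagged tuples:
#   ("top",)  ("var", x)  ("arr", dom, cod)  ("lab", l, body)  ("and", left, right)
TOP = ("top",)

def sub(a, b):
    """Assumed algorithmic subtyping, without distributivity."""
    if b[0] == "top":
        return True
    if b[0] == "and":      # intersection on the right: every branch is required
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":      # intersection on the left: one branch suffices
        return sub(a[1], b) or sub(a[2], b)
    if a[0] != b[0]:
        return False
    if a[0] == "var":
        return a[1] == b[1]
    if a[0] == "arr":      # contravariant domain, covariant codomain
        return sub(b[1], a[1]) and sub(a[2], b[2])
    return a[1] == b[1] and sub(a[2], b[2])   # labels: same name, compare bodies

def subst_all(t, theta):
    """Simultaneous substitution: each variable y in theta becomes {y: theta[y]}."""
    if t[0] == "var":
        return ("lab", t[1], theta[t[1]]) if t[1] in theta else t
    if t[0] in ("arr", "and"):
        return (t[0], subst_all(t[1], theta), subst_all(t[2], theta))
    if t[0] == "lab":
        return ("lab", t[1], subst_all(t[2], theta))
    return t

SAMPLES = [TOP, ("lab", "n", TOP), ("arr", TOP, TOP)]

def families(S):
    """All sampled families (E_y) for y in S, as dictionaries."""
    names = sorted(S)
    return [dict(zip(names, combo)) for combo in product(SAMPLES, repeat=len(names))]

def avoid_sub(a, b, S):
    """Sampled A <: B \\ S: the E-family and the F-family vary independently."""
    return all(sub(subst_all(a, te), subst_all(b, tf))
               for te in families(S) for tf in families(S))

def avoid_equiv(a, b, S):
    return avoid_sub(a, b, S) and avoid_sub(b, a, S)

X, Y = ("var", "X"), ("var", "Y")
TT = ("arr", TOP, TOP)
A = ("and", TT, ("and", ("arr", X, TOP), ("arr", Y, TOP)))
B = TT
TY = ("arr", TOP, Y)                  # Top -> Y, with Y in an active position

print(avoid_equiv(A, B, {"X", "Y"}))  # both variables absorbed
print(avoid_sub(TY, TY, {"Y"}))       # reflexivity on Y is lost under avoidance
print(avoid_sub(TY, TY, set()))       # with S empty this is ordinary subtyping
```

The second and third calls differ only in S, which illustrates that putting a variable in the avoidance set disables reflexive comparison on it.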

The corresponding multi-variable inversion lemma would be:

(SemInv-Multi) If A[θ_C] ~ B[θ_D] for specific families (Cₓ), (Dₓ), then either some equivalence among bodies holds (TBD how to phrase the disjunctive branch in the multi-variable case), or A ~ B \ S semantically.

I’ll come back to the disjunctive branch — phrasing it correctly is part of the question.

Does Lemma 1 generalize?

Yes, straightforwardly. If A[θ_E] ~ B[θ_F] for all θ_E, θ_F ranging over the avoidance set S, then by transitivity:

B[θ_F] ~ A[θ_E] ~ B[θ_F']

for any fixed θ_E, giving B[θ_F] ~ B[θ_F'] for all θ_F, θ_F'. ✓

So substitution-invariance of B holds with respect to all variables in S simultaneously.
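Spelled out as a derivation, the transitivity step reads as follows. This is only a restatement of the argument above in LaTeX, with \theta_{F'} standing for the second family θ_F' and \theta_E held fixed throughout.

```latex
\begin{align*}
&\forall \theta_E, \theta_F.\; A[\theta_E] \sim B[\theta_F]
  && \text{(premise)} \\
&B[\theta_F] \sim A[\theta_E]
  && \text{(instance at } (\theta_E, \theta_F)\text{, symmetry)} \\
&A[\theta_E] \sim B[\theta_{F'}]
  && \text{(instance at } (\theta_E, \theta_{F'})\text{)} \\
&B[\theta_F] \sim B[\theta_{F'}]
  && \text{(transitivity of } \sim\text{)}
\end{align*}
```

The same three steps with the roles of A and B exchanged give A[θ_E] ~ A[θ_E'] for all θ_E, θ_E'.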

What I want to call your attention to: the disjunctive branch is more complex

In the single-variable case, the lemma says: “either C ~ D, or A ~ B \ {X}.”

In the multi-variable case, we have families (Cₓ), (Dₓ) indexed by S. What’s the analog of “either C ~ D”?

The naive generalization is: “either Cₓ ~ Dₓ for all X ∈ S, or A ~ B \ S.” But this is too coarse — and it’s where I think the multi-variable case genuinely differs from the single-variable case.

Why it’s too coarse: Consider two variables X, Y and a type A where X appears actively but Y does not. Then the IH for X should fire (“Cₓ ~ Dₓ”), but Y should be in the avoidance set (“A ~ B \ {Y}”). We need a partition of S into those that are forced equivalent and those that are avoided.

So the right generalization is more like:

(SemInv-Multi) If A[θ_C] ~ B[θ_D], then there exists a partition S = S_eq ⊔ S_avoid such that:

For all X ∈ S_eq, Cₓ ~ Dₓ.

A ~ B \ S_avoid semantically (with avoidance over S_avoid).

Which X’s land in S_eq vs S_avoid is determined by whether X “actively participates” in the equivalence.

Does the inductive argument go through?

Now I need to think about whether the argument I gave for the single-variable case extends. The hard subgoal was: in the stuck case, lift B̂ <: Â₂ (one specific direction) to arbitrary substitutions.

In the multi-variable case, the IH on (A₁, B) gives a partition S = S_eq^(1) ⊔ S_avoid^(1) such that the bodies are forced equivalent for X ∈ S_eq^(1) and A₁ ~ B \ S_avoid^(1) semantically.

For the avoided variables S_avoid^(1), the substitution-invariance argument (Lemma 1 generalized) goes through: B is substitution-invariant in S_avoid^(1).

Now I need to lift B̂ <: Â₂ to arbitrary substitutions over S_avoid^(1). The structural argument should be:

B’s occurrences of variables in S_avoid^(1) are in absorbed positions (by the substitution-invariance argument, which generalizes pointwise to each X ∈ S_avoid^(1)).
Therefore in any derivation B̂ <: T, those variables’ substituents are in absorbed positions.
So we can vary the substituents over S_avoid^(1) freely.

For variables in S_eq^(1), we have Cₓ ~ Dₓ already, and they don’t need to be lifted — they’re “fixed” in the comparison.

Then A ~ B \ S_avoid^(1) should follow.

But wait — there’s a subtle issue. The IH on (A₁, B) gives us partition S_eq^(1). But when we conclude A ~ B \ S_avoid for the outer lemma, what is S_avoid? It needs to be a partition of S that works for all of A, not just A₁.

Concretely: a variable Y might be absorbed/inactive in A₁ (so Y ∈ S_avoid^(1)) but active in A₂ (forcing Cᵧ ~ Dᵧ from the A₂ side). Then Y should be in S_eq for the outer partition, not S_avoid.

So we need:

S_eq = S_eq^(1) ∪ {variables forced equivalent by A₂'s structure}
S_avoid = S \ S_eq

For the A₂ side, we have B̂ <: Â₂ (one direction), not full mutual subtyping. So we can’t just apply the IH. We need a separate analysis.

The new wrinkle in multi-variable

Here’s where the multi-variable case is genuinely harder than single-variable.

In single-variable: when we lift B̂ <: Â₂ to arbitrary substitutions, we vary X freely — that’s the whole game.

In multi-variable: we want to vary X ∈ S_avoid freely, but keep X ∈ S_eq fixed (or vary them only consistently with Cₓ ~ Dₓ). This means the lifting argument needs to be “selective” — it lifts only over a subset of variables.

For S_avoid^(1) variables, the substitution-invariance comes from B’s substitution-invariance, which we established.

For variables active in A₂ but absorbed in A₁: these are in S_eq^(1) (because they’re absorbed in A₁, so the IH for (A₁, B) would assign them to avoidance, hence inactive). Wait, no — let me re-examine.

The IH on (A₁, B) partitions S based on activity in A₁ and B. A variable Y active in A₂ only doesn’t show up in the (A₁, B) IH at all — because A₁ doesn’t constrain Y (it’s absorbed there), and the equivalence Â₁ ~ B̂ doesn’t force Cᵧ ~ Dᵧ.

So the IH puts Y ∈ S_avoid^(1). But Y is active in A₂. In B̂ <: Â₂, Y actively participates (covariantly in Â₂, requiring matching in B̂).

Wait, but if Y is in S_avoid^(1), then B’s Y-occurrences are in absorbed positions (by substitution-invariance of B over S_avoid^(1)). So B̂ doesn’t have Y actively. Then how does B̂ <: Â₂ succeed when Â₂ has active Y?

It can only succeed if Â₂’s Y is also in absorbed position in this comparison (e.g., subsumed by Top on B̂’s side, or its branch is dropped). But then Y isn’t really “active” in the comparison either: it is only “active in A₂’s syntax”, not in the actual derivation B̂ <: Â₂.

So the picture is: variables active in the syntax of A₂ but absorbed in the comparison B̂ <: Â₂ still go to S_avoid and we can lift them freely.

Variables that are active in the comparison B̂ <: Â₂ require B̂ to provide matching Y-bearing labels, meaning Y is active in B̂, meaning Y ∈ S_eq^(1) (because B’s Y-occurrences would then not be in absorbed-only positions, so substitution-invariance for Y would fail, so Y wouldn’t have been in S_avoid^(1)).

Wait, but Y ∈ S_eq^(1) means Cᵧ ~ Dᵧ from the (A₁, B) IH. But Y is supposed to be active in A₂, not necessarily A₁. How would the (A₁, B) IH conclude Cᵧ ~ Dᵧ?

Hmm, the answer must be: if Y is active in B’s structure (matching A₂’s active Y), then Y is also matched somewhere in Â₁ ~ B̂, because B has Y in its structure regardless of which A it’s compared to. So Â₁ ~ B̂ would force Cᵧ ~ Dᵧ via that matching position.

This seems to work out: variables active in B’s structure are in S_eq^(1) (the equivalence is forced by the matching with A₁). Variables not active in B’s structure are in S_avoid^(1), and any apparent activity in A₂ is actually absorbed in the comparison.

But this analysis assumes a clean dichotomy: either Y is active everywhere in B’s structure or nowhere. With intersection types, the picture is more complex: Y might be active in one branch of B and absorbed in another.

Re-examining Lemma 2 in the multi-variable case

You said Lemma 2 (absorption from substitution-invariance) might not be necessary for your proof framework. Let me push back: I think for the multi-variable case it becomes more important, because the analysis above relies on knowing precisely which variables are in absorbed positions in B’s structure.

In the single-variable case, you can prove the lemma by a clever argument that uses substitution-invariance of B to propagate through derivations directly, without needing an explicit absorption characterization. Maybe that’s what you have in mind when you say Lemma 2 isn’t needed.

In the multi-variable case, we have a family of substitution-invariances (one per X ∈ S_avoid^(1)), and we want to combine them. Combining substitution-invariances coherently, especially when different X’s have different absorption patterns, might require a more explicit characterization.

However, here’s a positive observation: substitution-invariance over a set of variables is a strictly stronger property than over each variable individually. If B[θ_F] ~ B[θ_F'] for all θ_F, θ_F' ranging over S_avoid^(1), this includes mixing values for different variables. So we’re not losing anything by going to multi-variable — we just have a stronger invariance.

The lifting of B̂ <: Â₂ to arbitrary θ_F over S_avoid^(1) should follow from this stronger invariance plus a structural argument analogous to the single-variable case.

Where I think the genuine complication arises

Let me try to identify the specific place where multi-variable might genuinely differ from single-variable.

Scenario: Two variables X, Y, both in the avoidance candidate set. X is active in A₂ (absorbed in A₁). Y is active in A₁ (absorbed in A₂). B has both X and Y structurally, both in absorbed positions.

By the IH on (A₁, B): Y is forced equivalent (Cᵧ ~ Dᵧ) because it’s active in A₁’s comparison… wait, but B’s Y is in absorbed position, so the comparison Â₁ ~ B̂ shouldn’t force Cᵧ ~ Dᵧ. Hmm, unless A₁’s active Y matches against B’s structure such that…

Let me think again. If Y is active in A₁ (covariantly, say) and absorbed in B, then consider Â₁ <: B̂: the Y in Â₁ contributes covariantly to Â₁’s “size”, but it’s compared to whatever is in B̂’s corresponding position. If B̂’s position is Top (absorbing), then {Y:Cᵧ} <: Top ✓ trivially. The other direction B̂ <: Â₁ requires Top (or whatever B̂ has) <: {Y:Cᵧ}, which fails unless Y is absorbed in A₁ too.

So if Y is active in A₁ and absorbed in B, the equivalence Â₁ ~ B̂ requires Y also be absorbed in A₁ (because B̂ <: Â₁ would fail otherwise). Contradiction with our setup.

So active-in-A₁-and-absorbed-in-B is incompatible with Â₁ ~ B̂. This rules out the scenario.

What about Y active in both A₁ and B, in matching positions? Then Â₁ ~ B̂ forces Cᵧ ~ Dᵧ via the label comparison. So Y ∈ S_eq^(1). Then the IH puts Y in the equivalence-forced bucket. ✓

What about Y active in B but absorbed in A₁? Consider B̂ <: Â₁: B’s active Y (say covariantly) needs Â₁ to provide a covariant slot, but A₁ has Y absorbed, so A₁ provides Top or similar in that position, making B̂ <: Â₁ work without forcing a comparison of Cᵧ and Dᵧ. The reverse direction Â₁ <: B̂ compares A₁’s Top-position against B̂’s {Y:Dᵧ}: it requires Top <: {Y:Dᵧ}, which fails. So Â₁ ~ B̂ fails. Contradiction.

So Y active in B requires Y active in A₁ (in matching positions), and equivalence forces Cᵧ ~ Dᵧ.

Cleaner statement: The activity patterns of Y in A₁ and B must match for Â₁ ~ B̂ to hold. So:

Y active in matching positions in both → Cᵧ ~ Dᵧ (in S_eq^(1))
Y absorbed in both → in S_avoid^(1)
Mismatched → equivalence fails, situation impossible.
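The three cases can be illustrated on the smallest arrows that exhibit them. This is a sketch only: it assumes a non-distributive algorithmic reading of the rules, uses a hypothetical tuple encoding, treats "absorbed" in its simplest form (the position holds Top), and samples the bodies Cᵧ, Dᵧ from three pairwise non-equivalent types instead of quantifying over all of them.

```python
from itertools import product

# Hypothetical encoding of types as tagged tuples:
#   ("top",)  ("var", x)  ("arr", dom, cod)  ("lab", l, body)  ("and", left, right)
TOP = ("top",)

def sub(a, b):
    """Assumed algorithmic subtyping, without distributivity."""
    if b[0] == "top":
        return True
    if b[0] == "and":      # intersection on the right: every branch is required
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":      # intersection on the left: one branch suffices
        return sub(a[1], b) or sub(a[2], b)
    if a[0] != b[0]:
        return False
    if a[0] == "var":
        return a[1] == b[1]
    if a[0] == "arr":      # contravariant domain, covariant codomain
        return sub(b[1], a[1]) and sub(a[2], b[2])
    return a[1] == b[1] and sub(a[2], b[2])   # labels: same name, compare bodies

def subst(t, x, body):
    """t[x -> {x: body}]: replace variable x by the label type {x: body}."""
    if t[0] == "var":
        return ("lab", x, body) if t[1] == x else t
    if t[0] in ("arr", "and"):
        return (t[0], subst(t[1], x, body), subst(t[2], x, body))
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], x, body))
    return t

def equiv(a, b):
    return sub(a, b) and sub(b, a)

SAMPLES = [TOP, ("lab", "n", TOP), ("arr", TOP, TOP)]
PAIRS = list(product(SAMPLES, repeat=2))

Y = ("var", "Y")
TT = ("arr", TOP, TOP)   # Top -> Top: the Y-position holds Top
TY = ("arr", TOP, Y)     # Top -> Y:   Y is active (covariant)

# Mismatched, Y active in A1 only: is there any (C, D) making the types equivalent?
active_in_a1_only = any(equiv(subst(TY, "Y", c), subst(TT, "Y", d)) for c, d in PAIRS)
# Mismatched, Y active in B only.
active_in_b_only = any(equiv(subst(TT, "Y", c), subst(TY, "Y", d)) for c, d in PAIRS)
# Matched: equivalence of the substituted types coincides with C ~ D.
matched_forces_bodies = all(
    equiv(subst(TY, "Y", c), subst(TY, "Y", d)) == equiv(c, d) for c, d in PAIRS)

print(active_in_a1_only, active_in_b_only, matched_forces_bodies)
```

On these samples neither mismatched pair is ever equivalent, and in the matched pair the equivalence holds exactly when the bodies are equivalent.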

This is reassuring. In the multi-variable case, the IH on (A₁, B) cleanly partitions S by the “common activity” of Y in both A₁ and B.

So does the multi-variable lemma go through?

I believe the answer is yes, with care. Let me sketch:

Theorem (SemInv-Multi). If A[θ_C] ~ B[θ_D], there exist disjoint subsets S_eq, S_avoid ⊆ S with S_eq ∪ S_avoid = S, such that:

Cᵧ ~ Dᵧ for all Y ∈ S_eq.

A ~ B \ S_avoid semantically.

Proof sketch by induction on |A| + |B|:

Base cases (A, B both ordinary, possibly variables, labels, Top, arrows): direct analysis. Define S_eq as variables forced into label-comparisons, S_avoid as the rest.
Intersection cases: in the A = A₁ & A₂, B ordinary stuck case, apply IH to (A₁, B) getting partition (S_eq^(1), S_avoid^(1)). The partition for the outer lemma is… I want to say (S_eq^(1), S_avoid^(1)) directly, because:
- S_avoid^(1) variables are absorbed in B, so by the structural argument, also absorbed in B̂ <: Â₂. Lift to arbitrary substituents for these.
- S_eq^(1) variables: equivalence already established by IH.

Wait, but I need to also check that A₂’s structure is compatible. Specifically, A₂[θ_C] <: B[θ_D] could fail for variables in S_avoid^(1) (we don’t need it; we only need the :> direction lifted), and I need the :> direction to lift correctly.

Actually hold on: in the single-variable stuck case I was worried about whether B̂ <: Â₂ lifts. The argument I gave used B’s substitution-invariance. In multi-variable, we have B’s substitution-invariance over S_avoid^(1), and we want to lift B̂ <: Â₂ to vary substituents over S_avoid^(1) only (keeping S_eq^(1) fixed). This should follow from the same structural argument.

So I believe the multi-variable lemma holds, with essentially the same proof structure as single-variable, with the partition (S_eq, S_avoid) doing the bookkeeping.

Summary

To answer your direct question:

Do you think Lemma 2 can be extended to prove that?

I think the multi-variable generalization works, and the core argument doesn’t fundamentally change — it just needs careful bookkeeping with the (S_eq, S_avoid) partition. The substitution-invariance of B (Lemma 1 generalized) over S_avoid gives us enough to lift one-directional comparisons over those variables freely, and the IH separately handles the S_eq variables by forcing the body equivalences.

Key design decision for your proof framework: the lemma’s statement should explicitly produce a partition, not just a binary disjunction. This is the multi-variable analog of “either C ~ D or avoidance” — now phrased as “for some partition S = S_eq ⊔ S_avoid, the equivalences hold for S_eq and avoidance holds for S_avoid.”

Genuine uncertainty I want to flag

I want to be honest about something. The argument above relies on the claim that the activity pattern of any variable Y is “matched” in A₁ and B when Â₁ ~ B̂ holds. I argued this informally by case analysis, but a rigorous proof would require carefully defining “activity pattern” for types with intersections and showing that mutual subtyping forces matching patterns.

This is a genuinely structural property of the subtyping relation, and proving it rigorously is non-trivial. In a sense, it’s the multi-variable analog of “Lemma 2” that you said wasn’t necessary in the single-variable case. In the multi-variable case, I think you can’t avoid making this structural property explicit, because it’s what underwrites the partition.

So my honest answer is: yes, the multi-variable generalization should hold, but unlike the single-variable case where you might be able to get away with implicit structural arguments, the multi-variable case will likely require an explicit structural lemma about activity-pattern matching under mutual subtyping. It’s worth thinking about whether such a lemma is feasible to state and prove in your framework, or whether it suggests reformulating the inversion lemma differently.

A few questions back to you:

In your proof framework, how do you usually handle the partition-style disjunction? Is there a clean way to thread it through the induction, or does it complicate the IH significantly?
Do you have a sense of whether the “activity matching” argument is something you’ve already developed for other purposes (e.g., in proving other meta-theoretic properties of your subtyping relation)? If so, it might be reusable here.
When you said Lemma 2 isn’t necessary, did you have a specific alternative argument in mind for the single-variable case? Knowing that might tell us whether the alternative also extends to multi-variable.

You

Ah, then the lemma would be: if A[X → {X:C}] ~ B[X → {X:D}] / S, then either C ~ D or A ~ B / (S + {X}),

where A ~ B / S is the semantic definition.

This would be something I want to prove.

Claude

The Right Statement of the Lemma

Ah, this is a much cleaner formulation. Let me make sure I understand it and then think about whether the proof goes through.

Restating the lemma

(SemInv-Multi). If A[X -> {X:C}] ~ B[X -> {X:D}] \ S (semantic equivalence under avoidance set S), then either:

C ~ D, or

A ~ B \ S ∪ {X} (semantic equivalence under extended avoidance set).

Where the semantic definition of A <: B \ S is:

∀(Eᵧ)ᵧ∈S, (Fᵧ)ᵧ∈S. A[θ_E] <: B[θ_F]

with θ_E, θ_F being independent substitutions over S.

This is much nicer than the partition formulation I was suggesting, because:

The induction is on the structure of A, B and tracks one “fresh” variable X at a time.
The avoidance set S accumulates across recursive calls.
The disjunction stays binary at each step (either body equivalence forced, or X joins the set).

When you recurse, you can apply the lemma multiple times (once per variable), and the cumulative avoidance set grows. Excellent.

The structure of the proof

The premise unfolds: A[X -> {X:C}] ~ B[X -> {X:D}] \ S means

∀θ_E, θ_F over S. (A[X -> {X:C}])[θ_E] ~ (B[X -> {X:D}])[θ_F]

Since X ∉ S (presumably — X is the new variable to consider), the substitutions θ_E, θ_F don’t touch X. So this is:

∀θ_E, θ_F over S. A[θ_E, X -> {X:C}] ~ B[θ_F, X -> {X:D}]

The conclusion’s second branch unfolds:

∀θ_E', θ_F' over S ∪ {X}. A[θ_E'] ~ B[θ_F']

i.e., for arbitrary independent substitutions over S ∪ {X}, equivalence holds.

So we’re trying to lift the equivalence from “specific (C, D) for X, arbitrary over S” to “arbitrary over S ∪ {X}.”

The proof attempt

By induction on |A| + |B|. I’ll focus on the previously stuck case to see if the new formulation handles it cleanly.

Stuck case: A = A₁ & A₂, B ordinary, Â₁ ~ B̂ \ S (full mutual avoidance-equivalence), B̂ <: Â₂ \ S (one-directional only), Â₂ <: B̂ \ S fails.

Where I’m using Â = A[X -> {X:C}], B̂ = B[X -> {X:D}], and \ S indicates the avoidance-semantic version.

Apply IH to (A₁, B): either C ~ D (done, branch 1) or A₁ ~ B \ S ∪ {X} semantically.

Take the second case: A₁ ~ B \ S ∪ {X} semantically, i.e., for arbitrary θ_E', θ_F' over S ∪ {X},

A₁[θ_E'] ~ B[θ_F']. (*)

We want to conclude: for arbitrary θ_E', θ_F' over S ∪ {X},

(A₁ & A₂)[θ_E'] ~ B[θ_F'].

The <: direction: pick the A₁ branch, apply (*). ✓

The :> direction: need B[θ_F'] <: (A₁ & A₂)[θ_E'], i.e., B[θ_F'] <: A₁[θ_E'] (have, by (*)) and B[θ_F'] <: A₂[θ_E'].

We need to lift B̂ <: Â₂ \ S (which says ∀θ_E, θ_F over S. B[θ_F, X -> {X:D}] <: A₂[θ_E, X -> {X:C}]) to varying over S ∪ {X}.

Where the proof needs Lemma-1-style transitivity

Here’s the move. From (*) (with S ∪ {X} as the avoidance set), instantiate with any θ_E' and any θ_F':

A₁[θ_E'] ~ B[θ_F'].

By transitivity (the multi-variable Lemma 1), B’s equivalence class is invariant under the substituents over S ∪ {X}. That is, for any θ_F', θ_F'':

B[θ_F'] ~ B[θ_F'']. (†)

In particular, for any θ_F' over S ∪ {X}, B[θ_F'] ~ B[θ_F, X -> {X:D}] for any choice of θ_F over S and the fixed D.

Now, B̂ <: Â₂ \ S says: for arbitrary θ_E, θ_F over S,

B[θ_F, X -> {X:D}] <: A₂[θ_E, X -> {X:C}].

By (†), the LHS is equivalent to B[θ_F'] for any θ_F' over S ∪ {X}. So:

B[θ_F'] <: A₂[θ_E, X -> {X:C}] for any θ_F' over S ∪ {X} and θ_E over S.

But the RHS still has {X:C} fixed, not arbitrary {X:E_X}. We’ve only varied θ_F' over S ∪ {X} and θ_E over S; X on A₂’s side is still pinned to C.

The remaining gap

So the same issue I worried about before persists: we can vary B’s side freely (because B’s substitution-invariance includes X), but A₂’s side has X pinned to C. We need to lift A₂’s side to arbitrary E_X.

Here’s an attempt to close the gap using a symmetric application of Lemma 1.

(*) is symmetric: A₁ is also substitution-invariant over S ∪ {X}. So for any θ_E', θ_E'' over S ∪ {X}:

A₁[θ_E'] ~ A₁[θ_E'']. (‡)

In particular, A₁[θ_E', X -> {X:C}] (where θ_E' ranges over S and X is set to C) is equivalent to A₁[θ_E'', X -> {X:E_X}] for any choice of E_X.

Now combine: from Â₁ ~ B̂ \ S, we have A₁[θ_E, X -> {X:C}] ~ B[θ_F, X -> {X:D}]. Taking transitivity through A₁:

A₁[θ_E', X -> {X:E_X}] ~ A₁[θ_E, X -> {X:C}] ~ B[θ_F, X -> {X:D}].

So A₁[θ_E', X -> {X:E_X}] ~ B[θ_F, X -> {X:D}]. Using (†) again on B’s side, we get equivalence with any B[θ_F', X -> {X:F_X}]. So:

A₁[θ_E', X -> {X:E_X}] ~ B[θ_F', X -> {X:F_X}], which is just (*) restated. Fine — consistent.

Now back to lifting B̂ <: Â₂ \ S. Combine with the above: for arbitrary θ_E', θ_F' over S (and any E_X, F_X):

B[θ_F', X -> {X:F_X}] ~ A₁[θ_E', X -> {X:E_X}] (by (*)),
A₁[θ_E', X -> {X:E_X}] ~ A₁[θ_E, X -> {X:C}] (by (‡) on A₁),
A₁[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}] (one direction of Â₁ ~ B̂ \ S),
B[θ_F, X -> {X:D}] <: A₂[θ_E, X -> {X:C}] (assumption B̂ <: Â₂ \ S).

Chaining: B[θ_F', X -> {X:F_X}] <: A₂[θ_E, X -> {X:C}].

The RHS still has θ_E over S and X fixed to C. We need it varying over S ∪ {X}, i.e., need RHS to be A₂[θ_E', X -> {X:E_X}] for arbitrary choices.

And here’s the question: is A₂[θ_E, X -> {X:C}] ~ A₂[θ_E', X -> {X:E_X}]?

We don’t have a substitution-invariance result for A₂ directly, because we don’t have a (*)-style equivalence for A₂ and B. We only have B̂ <: Â₂ (one direction).
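It may help to see the gap side by side. The display below restates what the chain establishes, what the conclusion needs, and one fact about A₂ that would bridge the two by transitivity. The third line is my own observation rather than something established above: only the left-to-right direction of the A₂ invariance would be needed, not a full equivalence.

```latex
\begin{align*}
\text{have:}\quad
  & \forall \theta_F' \text{ over } S \cup \{X\},\ \forall \theta_E \text{ over } S.\quad
    B[\theta_F'] <: A_2[\theta_E,\, X \mapsto \{X : C\}] \\
\text{want:}\quad
  & \forall \theta_F',\ \theta_E' \text{ over } S \cup \{X\}.\quad
    B[\theta_F'] <: A_2[\theta_E'] \\
\text{would suffice:}\quad
  & \forall \theta_E \text{ over } S,\ \forall E_X.\quad
    A_2[\theta_E,\, X \mapsto \{X : C\}] <: A_2[\theta_E,\, X \mapsto \{X : E_X\}]
\end{align*}
```

Given the third line, "want" follows from "have" by choosing θ_E to be the restriction of θ_E' to S and applying transitivity of <:.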

The honest answer: I don’t see how to close this gap with Lemma 1 alone

The issue is that Lemma 1 (substitution-invariance via transitivity) requires full equivalence to bootstrap. It gave us substitution-invariance of A₁ and B because they’re equivalent. But A₂ is only known to be a supertype of B̂ — we don’t have equivalence to chain through.

Without further structural information about A₂, we can’t conclude that A₂ is substitution-invariant in X.

So the proof seems to genuinely get stuck here, even with the cleaner lemma statement.

Where this leaves us

I think this means one of the following is true:

(a) The lemma is false as stated. There’s a counterexample where the equivalence holds for specific (C, D) and arbitrary substitutions over S, but does not hold when you also vary X independently. I haven’t constructed one, but the proof obstruction suggests we should look for one.

(b) The lemma needs additional structural reasoning. Beyond Lemma 1, we need an argument that A₂’s X-occurrences are constrained by the surrounding equivalence, perhaps via an explicit Lemma 2 (“absorption from substitution-invariance”) that I previously sketched, applied to A₂’s relationship with the equivalence class of Â₁ ~ B̂.

(c) The induction needs to be on a different measure. Maybe instead of inducting on |A| + |B|, we should induct on the derivation of the avoidance-equivalence, or on a more refined measure that handles the intersection case differently.

Trying for a counterexample

Let me genuinely attempt option (a) to see if the lemma fails.

I need A = A₁ & A₂, B ordinary, Â₁ ~ B̂ \ S (so (*) holds for (A₁, B)), B̂ <: Â₂ \ S (one direction), Â₂ <: B̂ fails, and C ≁ D. Then I need to show that the conclusion fails — i.e., that varying X independently breaks the equivalence.

Attempt: S = ∅ (start simple), A₁ = Top -> Top, A₂ = X -> Top, B = Top -> Top, C, D arbitrary with C ≁ D.

Â₁ = Top -> Top, B̂ = Top -> Top. Â₁ ~ B̂ ✓.
Â₂ = {X:C} -> Top. B̂ <: Â₂: contravariantly {X:C} <: Top ✓, covariantly Top <: Top ✓. ✓
Â₂ <: B̂: contravariantly Top <: {X:C} ✗.
Â ~ B̂: <: direction picks Â₁; :> direction needs B̂ <: Â₁ ✓ and B̂ <: Â₂ ✓. So Â ~ B̂ ✓.
IH on (A₁, B): A₁ ~ B \ {X} semantically: yes, Top -> Top ~ Top -> Top regardless of X-substitution.

Now check conclusion: A ~ B \ {X} semantically, i.e., for arbitrary E, F, (Top -> Top) & ({X:E} -> Top) ~ Top -> Top.

<: direction: pick Top -> Top ✓.
:> direction: Top -> Top <: (Top -> Top) & ({X:E} -> Top). Need Top -> Top <: Top -> Top ✓ and Top -> Top <: {X:E} -> Top (contravariantly {X:E} <: Top ✓). ✓

So conclusion holds. ✓ The lemma’s not falsified here.
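
As a sanity check on the attempt above, here is a minimal Python sketch of an algorithmic checker for the simple subtyping rules (Top, intersection introduction on the right, intersection elimination on the left, arrows, labels, reflexivity on atoms). It assumes there is no distributivity of arrows or labels over intersections. The tuple encoding, the names sub, equiv and a_hat, and the base type nat are illustrative choices, not part of the system under discussion. It checks, on a handful of sample bodies E, that (Top -> Top) & ({X:E} -> Top) ~ Top -> Top.

```python
# Types as tuples (an assumed encoding):
#   TOP, ("->", dom, cod), ("lab", name, body), ("&", left, right), ("base", name)
TOP = ("top",)
NAT = ("base", "nat")

def sub(a, b):
    """Decide a <: b for the simple, non-distributive rules."""
    if b == TOP:                      # anything <: Top
        return True
    if b[0] == "&":                   # right intersection: need both branches
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "&":                   # left intersection: pick one branch
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "->" and b[0] == "->": # contravariant domain, covariant codomain
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2])
    return a == b                     # atoms: reflexivity only

def equiv(a, b):
    """Mutual subtyping."""
    return sub(a, b) and sub(b, a)

def a_hat(e):
    """(Top -> Top) & ({X:e} -> Top), i.e. A with X replaced by {X:e}."""
    return ("&", ("->", TOP, TOP), ("->", ("lab", "X", e), TOP))

B = ("->", TOP, TOP)
samples = [TOP, NAT, ("->", TOP, TOP), ("lab", "X", NAT)]

results = [equiv(a_hat(e), B) for e in samples]
print(results)  # [True, True, True, True]
```

A finite sample is not a proof; it only shows that none of the sampled bodies breaks the equivalence, which matches the hand calculation.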

The issue is that even though my proof argument has a gap (couldn’t lift A₂’s side via Lemma 1 alone), the conclusion still happens to hold because A₂’s X is in absorbed position. The proof needs to capture this absorption argument explicitly.

So we do need something like Lemma 2 after all

I want to retract my earlier agreement that Lemma 2 isn’t necessary. Looking at it now, I think:

Lemma 1 (substitution-invariance from equivalence) gives you invariance of A₁ and B.
Lemma 2 (absorption from substitution-invariance) is needed to argue that derivations involving B lift to arbitrary substitutions, even on the A₂ side, because A₂’s X-occurrences in the comparison B̂ <: Â₂ must match B̂’s structure, and B̂’s X-occurrences are absorbed.

The single-variable proof I sketched earlier implicitly relied on Lemma 2 — I was waving at it as “structural argument.” For the multi-variable version, this absorption argument needs to be explicit.

My current honest position

I think the lemma in your formulation is probably true, but proving it requires more than transitivity. Specifically:

Lemma 1 generalized (substitution-invariance of equivalent types over the avoidance set) — proved by transitivity. ✓
Lemma 2 generalized (when B is substitution-invariant over a set, derivations B̂ <: T lift to arbitrary substitutions over that set, for any T) — needs a careful structural proof, by induction on the derivation. This is the technically delicate piece.
Combining them in the stuck case: Lemma 1 gives B’s invariance from (*); Lemma 2 lifts B̂ <: Â₂ to arbitrary substitutions over S ∪ {X}.

Without Lemma 2, the proof stops at the gap I identified. With Lemma 2 (proven separately), the proof goes through.

So I’d reverse my earlier statement: Lemma 2 is necessary, and it does extend to multi-variable. The multi-variable extension is essentially: “if B is substitution-invariant over a set S, then in any derivation B̂ <: T, the S-bound substituents in B̂ appear only in absorbed positions, and the derivation lifts to arbitrary substituents over S.”

Would you like to try to prove Lemma 2 next, or explore whether a different proof strategy (e.g., induction on something other than type size) can avoid needing it?

You

Pf for (SemInv-Multi). If A[X -> {X:C}] ~ B[X -> {X:D}] \ S (semantic equivalence under avoidance set S), then either:

C ~ D, or
A ~ B \ S ∪ {X} (semantic equivalence under extended avoidance set).

Case A = A1 & A2 and B = ordinary B.
Premise: ∀θ_E, θ_F over S. (A[X → {X:C}])[θ_E] ~ (B[X → {X:D}])[θ_F]

Another issue I note here, regardless of how the proof is done, is with the semantic version of the rule: there is no way to do a precise case analysis on how the proof derivation is going, while with the syntactic restriction, it is clear that we can either take the left or right branch of the intersection.

Claude

An Important Observation About Proof Structure

You’ve identified something genuinely important, and I want to take it seriously rather than wave it away. Let me think about this carefully.

The issue you’re raising

In a syntactic system, the equivalence (A₁ & A₂)[θ] ~ B[θ'] would be derived by some specific proof tree, and we’d case-split on the last rule used. We could say: “the <: direction was derived by <:∩L1 (picking A₁) or <:∩L2 (picking A₂), let me handle each case.”

In the semantic version, the premise is just ∀θ. ...equivalence holds.... There’s no proof tree to case-split on — equivalence is a property, not a derivation. So when we want to argue “we know Â₁ ~ B̂ (the left branch participates) versus Â₂ ~ B̂ (the right branch participates),” there’s no syntactic hook to grab.

This is a legitimate technical obstacle and not just a notational annoyance.

Why this matters concretely

In my “proof” above, I wrote:

Stuck case: Â₁ ~ B̂ \ S (full mutual avoidance-equivalence), B̂ <: Â₂ \ S (one-directional only), Â₂ <: B̂ \ S fails.

But this case split is implicitly using the syntactic structure of how subtyping might be derived: “either the left or right branch of A’s intersection participates fully, or only partially.” In the semantic formulation, I can’t make this case split: I just have a ∀θ. equivalence premise, and I have to extract structural information from it without recourse to a derivation.

So when I tried to apply IH to (A₁, B) to get (*) — that move was already syntactic in spirit. Let me check what’s actually justified semantically.

What can I extract semantically?

From the premise ∀θ_E, θ_F over S. (A₁ & A₂)[θ_E, X -> {X:C}] ~ B[θ_F, X -> {X:D}]:

For each fixed pair (θ_E, θ_F):

<: direction: (A₁ & A₂)[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}]. By intersection elimination on the left, this means either A₁[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}] or A₂[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}].
:> direction: B[θ_F, X -> {X:D}] <: A₁[θ_E, X -> {X:C}] and B[θ_F, X -> {X:D}] <: A₂[θ_E, X -> {X:C}]. (This is forced because the right side is an intersection.)

Note: which branch wins on the <: direction may depend on (θ_E, θ_F). For some choices of substituents, branch 1 might satisfy <:; for others, branch 2 might. There’s no guarantee of uniformity.

This is the semantic shadow of “intersection elimination on the left is non-deterministic in the syntactic system, and the choice may differ across instances.”

Re-examining the case structure

So semantically, the partition isn’t “either Â₁ ~ B̂ (and Â₂ is just along for the ride) or Â₂ ~ B̂ (and Â₁ is along for the ride).” It’s something more like:

For every (θ_E, θ_F), either A₁[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}] or A₂[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}] (possibly varying).
Always B[...] <: A₁[...] and B[...] <: A₂[...].

And from these we want to extract enough to apply the IH.

Can we apply IH to (A₁, B)?

The IH on (A₁, B) requires the premise A₁[X -> {X:C}] ~ B[X -> {X:D}] \ S semantically, i.e., ∀θ_E, θ_F over S. A₁[θ_E, X -> {X:C}] ~ B[θ_F, X -> {X:D}].

From the outer premise, we have the :> direction for free: ∀θ_E, θ_F. B[...] <: A₁[...] — yes, because of the universal :> direction noted above.

The <: direction is the problem: we’d need ∀θ_E, θ_F. A₁[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}]. But the outer premise only gives us “either branch 1 or branch 2 wins” per instance — possibly varying.

So the IH on (A₁, B) is not directly applicable. Same for (A₂, B). This is a real semantic obstacle the syntactic system would have hidden.

How might we handle this?

Let me think about what semantic structure we can extract. Define:

T₁ = {(θ_E, θ_F) : A₁[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}]}
T₂ = {(θ_E, θ_F) : A₂[θ_E, X -> {X:C}] <: B[θ_F, X -> {X:D}]}

The premise tells us T₁ ∪ T₂ = all pairs, and additionally that the :> direction holds universally.

Now several cases:

(a) T₁ is everything. Then A₁[X -> {X:C}] ~ B[X -> {X:D}] \ S semantically, so the IH applies and we get C ~ D or A₁ ~ B \ S ∪ {X}. From there, the proof proceeds as before (with the same gap I identified about lifting A₂’s side).

(b) T₂ is everything. Symmetric.

(c) Neither — both T₁ and T₂ are proper subsets. For some pairs, branch 1 wins; for others, branch 2. This is the semantic case the syntactic system can’t see directly.

In case (c), we have a harder structural situation. Neither A₁ nor A₂ alone is <:-equivalent to B semantically; the equivalence comes from the combination A₁ & A₂ together being able to find a winning branch for any substitution.

A concrete instance of case (c)

Let me try to construct case (c). I need A₁, A₂, B where A₁ & A₂ ~ B for specific (C, D), but neither A₁ ~ B nor A₂ ~ B alone, and the “which branch wins” varies.

B = Top -> Top (ordinary).

A₁ = X -> Top. Then Â₁ = {X:C} -> Top.

Â₁ <: B̂ = {X:C} -> Top <: Top -> Top: contravariantly Top <: {X:C} ✗ (not generally).

So Â₁ <: B̂ fails. Then for Â <: B̂, we must use Â₂ <: B̂.

Let A₂ = Top -> Top. Then Â₂ = Top -> Top = B̂, and Â₂ <: B̂ ✓. Â <: B̂ ✓ via branch 2.

B̂ <: Â₁: Top -> Top <: {X:C} -> Top: contravariantly {X:C} <: Top ✓, covariantly Top <: Top ✓. ✓
B̂ <: Â₂: Top -> Top <: Top -> Top ✓.

So Â ~ B̂ with the <: direction handled by branch 2 always (since branch 1 always fails). This is case (b), not (c).

Hmm, let me try harder for case (c).

A₁ = X -> Top, A₂ = Top -> X, B = Top -> Top.

Â₁ <: B̂: {X:C} -> Top <: Top -> Top: Top <: {X:C} ✗.
Â₂ <: B̂: Top -> {X:C} <: Top -> Top: Top <: Top ✓ (contra), {X:C} <: Top ✓ (cov). ✓
So branch 2 always wins for <:.

Still case (b). It seems hard to make case (c) arise with B ordinary, because the <: direction depends on structural matching, and for a single fixed B ordinary type, structural compatibility is determined by Aᵢ’s shape, not by the substituents.
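
The T₁ / T₂ classification can be checked mechanically on small instances. The sketch below is a hedged illustration: the tuple encoding and the helpers sub, subst and classify are hypothetical names, and it assumes the simple non-distributive rules. Since S is empty in these attempts, it varies the label bodies substituted for X as a stand-in for ranging over (θ_E, θ_F). It only looks at the <: direction, which is where the branch choice happens.

```python
TOP = ("top",)
NAT = ("base", "nat")
X = ("var", "X")

def sub(a, b):
    """Decide a <: b for the simple, non-distributive rules."""
    if b == TOP:
        return True
    if b[0] == "&":
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "&":
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "->" and b[0] == "->":
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2])
    return a == b

def subst(t, env):
    """Replace each variable ("var", x) by env[x] when x is in env."""
    if t[0] == "var":
        return env.get(t[1], t)
    if t[0] in ("->", "&"):
        return (t[0], subst(t[1], env), subst(t[2], env))
    if t[0] == "lab":
        return ("lab", t[1], subst(t[2], env))
    return t

def classify(a1, a2, b, pairs):
    """Report which case the sampled substitution pairs fall into."""
    t1 = [sub(subst(a1, e), subst(b, f)) for e, f in pairs]  # branch 1 wins
    t2 = [sub(subst(a2, e), subst(b, f)) for e, f in pairs]  # branch 2 wins
    if all(t1):
        return "a"
    if all(t2):
        return "b"
    if all(x or y for x, y in zip(t1, t2)):
        return "c"
    return "premise fails"

def lab_x(body):
    return ("lab", "X", body)

bodies = [TOP, NAT, ("->", TOP, TOP)]
pairs = [({"X": lab_x(c)}, {"X": lab_x(d)}) for c in bodies for d in bodies]

# A1 = X -> Top, A2 = Top -> X, B = Top -> Top
case_arrows = classify(("->", X, TOP), ("->", TOP, X), ("->", TOP, TOP), pairs)
# A1 = X, A2 = Top, B = X: no branch works for every pair of bodies
case_labels = classify(X, TOP, X, pairs)
print(case_arrows, case_labels)  # b premise fails
```

A result of "c" on some instance would be a candidate for the genuinely mixed case; none of the attempts here produces one.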

Let me try where the substituents matter:

A₁ = X, A₂ = X, B = X. Then Â₁ = Â₂ = {X:C}, B̂ = {X:D}. Both branches do the same thing. Â ~ B̂: requires {X:C} <: {X:D} (forces C <: D) and {X:D} <: {X:C} (forces D <: C). So C ~ D. (Branch 1 of lemma — done.)

Try A₁ = X, A₂ = Top, B = X:

Â₁ <: B̂: {X:C} <: {X:D} requires C <: D.
Â₂ <: B̂: Top <: {X:D} ✗.
For Â <: B̂: need branch 1: C <: D.
B̂ <: Â: needs B̂ <: Â₁ ({X:D} <: {X:C}, i.e., D <: C) and B̂ <: Â₂ ({X:D} <: Top ✓).
So C ~ D forced. Branch 1 of lemma.
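
The same kind of check works for this attempt. The sketch below (hypothetical helper names, simple non-distributive rules assumed, base types nat and bool invented for illustration) confirms on a small sample that ({X:C} & Top) ~ {X:D} holds exactly when C ~ D. The sample includes nat & Top, which is equivalent to nat without being syntactically equal to it.

```python
TOP = ("top",)
NAT = ("base", "nat")
BOOL = ("base", "bool")

def sub(a, b):
    """Decide a <: b for the simple, non-distributive rules."""
    if b == TOP:
        return True
    if b[0] == "&":
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "&":
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "->" and b[0] == "->":
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2])
    return a == b

def equiv(a, b):
    return sub(a, b) and sub(b, a)

def a_hat(c):
    """A = X & Top with X replaced by {X:c}."""
    return ("&", ("lab", "X", c), TOP)

def b_hat(d):
    """B = X with X replaced by {X:d}."""
    return ("lab", "X", d)

samples = [TOP, NAT, BOOL, ("->", TOP, TOP), ("&", NAT, TOP)]

# The equivalence of the substituted types tracks the equivalence of C and D.
agree = all(equiv(a_hat(c), b_hat(d)) == equiv(c, d)
            for c in samples for d in samples)
print(agree)  # True
```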

Try A₁ = X -> Top, A₂ = Y -> Top, B = X & Y -> Top… but wait, the question is about avoiding X (or extending S with X). Adding Y would be a different variable. Let me stick with single X.

Honestly, I’m having trouble constructing a clean case (c). Let me think about why.

Why case (c) might be rare or impossible

For case (c) to arise:

For some (θ_E, θ_F), A₁[θ_E, X→{X:C}] <: B[θ_F, X→{X:D}] but A₂[θ_E, X→{X:C}] <: B[θ_F, X→{X:D}] fails.
For other (θ_E', θ_F'), the reverse: A₁ fails, A₂ succeeds.

But B[θ_F, X→{X:D}] is the same B̂ regardless of whether the LHS is A₁ or A₂. Whether A₁[θ_E, ...] <: B̂ holds depends on structural matching between A₁[θ_E, ...] and B̂. The substituents θ_E affect A₁’s shape but not B̂’s shape (since θ_F is what affects B).

For A₁’s success/failure to vary across θ_E, the substituents must affect whether A₁ structurally matches B. But structural matching at the top level is determined by the type’s outer constructor (arrow vs. label vs. variable), and substitution doesn’t change the outer constructor (it replaces label-bodies).

So whether A₁ <: B̂ succeeds is a function of the combination of A₁’s shape and the substituents’ details (label-body comparisons), but the top-level structural matching is independent of substitution.

If A₁’s top-level shape is incompatible with B’s top-level shape (e.g., A₁ is a label, B is an arrow), then A₁ <: B̂ fails for all θ_E. If they’re compatible at the top level, success/failure depends on recursive matching.

Conjecture: For B ordinary (not intersection), and A₁, A₂ both non-intersection (or with their intersections analyzed), the success/failure of Aᵢ <: B̂ is essentially uniform across θ_E — it either always holds or always fails.

If this is true, then case (c) doesn’t arise: either A₁ <: B̂ always (case (a)), or A₂ <: B̂ always (case (b)), and we’re back to applying IH on whichever branch always works.

Let me check this conjecture more carefully. Suppose A₁ is an arrow A₁_dom -> A₁_cod and B is an arrow B_dom -> B_cod. Then A₁[θ] <: B[θ'] iff B_dom[θ'] <: A₁_dom[θ] and A₁_cod[θ] <: B_cod[θ']. Whether these hold depends recursively on shapes and substituents.

It’s possible — maybe — for the recursive comparison to hold for some (θ, θ') and fail for others. Let me try:

A₁ = X -> Top, B = (X -> Top) -> Top… but B is (X -> Top) -> Top with substitution B[X→{X:D}] = ({X:D} -> Top) -> Top, which is a complex arrow. Hmm.

Let me try simpler: A₁ = (Top -> X) -> Top, B = (Top -> Top) -> Top. Then A₁[θ] = (Top -> {X:E_X}) -> Top, B[θ'] = (Top -> Top) -> Top. A₁[θ] <: B[θ'] requires B[θ']_dom <: A₁[θ]_dom, i.e., Top -> Top <: Top -> {X:E_X}, i.e., Top <: {X:E_X} ✗ for non-Top E_X.

So A₁ <: B̂ fails whenever Top <: {X:E_X} fails. The substituents are arbitrary types (we’re substituting {X:E_X} where E_X is any type), but that doesn’t help: Top <: {X:E_X} holds only if {X:E_X} is itself equivalent to Top, and a label type never is, whatever E_X is. So this fails for all E_X.

So uniformly fails. Doesn’t give case (c).

Let me try where the substitution helps:

A₁ = X -> Top, B = Y -> Top… but Y isn’t in our setup.

A₁ = X, B = X: A₁[θ] = {X:E_X}, B[θ'] = {X:F_X}. A₁ <: B: E_X <: F_X. Holds for some (E_X, F_X) (e.g., when E_X = F_X), fails for others.

So if A₁ = X, B = X, then A₁ <: B̂ varies by substitution. Now if A₂ is something else… but we’d need the equivalence to hold under the avoidance set semantics, meaning uniformly for arbitrary substituents. So A₁ <: B̂ would need to hold uniformly, meaning E_X <: F_X for all E_X, F_X — false. So A₁ <: B̂ doesn’t hold uniformly. Then we’d need A₂ <: B̂ to hold for the substituents where A₁ fails — meaning A₂ <: B̂ holds for those, and possibly the others too.
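
The contrast between the last two attempts can be made concrete. In this sketch (hypothetical helper names, simple non-distributive rules assumed, nat an invented base type), the arrow attempt fails for every sampled body, while the label attempt succeeds for some pairs of bodies and fails for others.

```python
TOP = ("top",)
NAT = ("base", "nat")

def sub(a, b):
    """Decide a <: b for the simple, non-distributive rules."""
    if b == TOP:
        return True
    if b[0] == "&":
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "&":
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "->" and b[0] == "->":
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and sub(a[2], b[2])
    return a == b

def lab_x(body):
    return ("lab", "X", body)

bodies = [TOP, NAT, ("->", TOP, TOP)]

# A1 = (Top -> X) -> Top against B = (Top -> Top) -> Top:
# needs Top <: {X:e}, which no label satisfies.
b_arrow = ("->", ("->", TOP, TOP), TOP)
uniform = [sub(("->", ("->", TOP, lab_x(e)), TOP), b_arrow) for e in bodies]

# A1 = X against B = X: reduces to e <: f, which depends on the bodies.
varying = [sub(lab_x(e), lab_x(f)) for e in bodies for f in bodies]

print(any(uniform))                    # False
print(True in varying, False in varying)  # True True
```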

If A₂ <: B̂ holds uniformly, we’re in case (b). If not, we’re in genuine case (c).

Try: A₂ = Top -> Top, B = X. A₂[θ] = Top -> Top, B[θ'] = {X:F_X}. A₂ <: B̂: Top -> Top <: {X:F_X} — different head constructors, fails.

So A₂ <: B̂ fails uniformly. But we needed it to hold for some substituents. Bad setup.

It’s looking like case (c) really is hard to construct, and might require some delicate orchestration. Conjecture remains: with B ordinary, case (c) doesn’t arise.

What I think this means

Three options:

(i) Prove case (c) doesn’t arise, leveraging the structural rigidity of B ordinary. Then the proof reduces to cases (a) and (b), which are essentially the syntactic case-split.

(ii) The lemma is actually false in general, and case (c) provides a counterexample we haven’t yet constructed. The semantic formulation might be too strong — equivalent to a syntactic system that’s also too strong, having too many rules.

(iii) Generalize the lemma’s conclusion to handle case (c) — perhaps the disjunction needs a third branch like “the equivalence is decomposable as a finite case-split where each piece falls into branch 1 or 2.”

I think (i) is most likely, and worth attempting a clean proof of. The structural intuition: with B ordinary, the comparison A₁ <: B̂ factors through A₁’s top-level structure and B’s top-level structure, and the variability across substituents is determined by recursive comparisons that, by induction, can be made uniform.

Returning to your point

You’re right that the semantic formulation loses the fine-grained case-analysis hook of the syntactic system. To recover it, we need either:

A “uniformity” lemma that says: in case (c) — where different branches win for different substituents — actually this can’t happen given other constraints in the lemma (e.g., B ordinary, A decomposable). This restores effective case (a)/(b) split.
A different proof technique that doesn’t require this case split. Perhaps directly constructing the conclusion’s universal statement without going through IH on a sub-case.
Prove via the syntactic system instead, then transport via soundness/completeness. If the syntactic proof is more tractable for the case-split reasons, prove there and transport.

Question back to you

In your paper’s framework, what does the syntactic case-split for the stuck case look like? You said earlier:

By intersection subtyping inversion, and assuming B is ordinary (not intersection type) we only get either A1[C] < B[D] or A2[C] < B[D] and B[D] < A1[C] and B[D] < A2[C]

The “or” here is at the level of derivations — the syntactic derivation must commit to one branch. So syntactically, we know for the specific (C, D) of the premise, either Â₁ <: B̂ or Â₂ <: B̂. We pick one (say the first) and proceed.

Importantly: this is for specific (C, D). The semantic version with \ S requires uniformity over θ_E, θ_F. So the syntactic-friendly reformulation of the lemma might be: at the meta-level, you do induction over the original derivation rather than over types-with-universal-substitution. That is, the syntactic system is doing something the pure semantic formulation can’t easily mimic.

Maybe the right move is:

Reformulate the lemma syntactically, with the avoidance set defined via syntactic rules (your original idea), and prove the lemma by induction on derivations. Then separately prove that the syntactic and semantic definitions of \ S coincide, as a soundness/completeness pair.

The syntactic proof can do precise case-splits on derivation rules. The semantic version is the cleaner specification. The two sides connect via the soundness/completeness theorem.

This is a pretty common pattern: define the property semantically (for specification clarity), define the property syntactically (for proof tractability), and prove they coincide. The hard inversion lemma might be much more tractable on the syntactic side.

What do you think? Does this match the structure of how your paper develops the system? And does the prospect of “prove syntactic version, then prove syntactic = semantic” seem feasible given your earlier worry about the syntactic version’s stuck case?

You

If I have syntactic premise and semantic conclusion, and also property that syntactic rules are sound (implies semantic spec)

A1 & A2 [C] ~ B [D] / S

for case: A1 [C] <: B [D] / S, B[D] <: A1[C] / S, B[D] <: A2[C] / S

From IH we have either D ~ C or A1 ~ B / S + X semantically

We need to show A1 & A2 ~ B / S + X semantically. The <: direction follows directly. For the :> direction, the missing goal is B <: A2 / S + X (semantically), which expands to: for any Sc, Sd whose domain is S + X, B[Sc, X → E] <: A2[Sd, X → F]. In the premise we have B[Sc, X → D] <: A2[Sd, X → C]. E, F are arbitrary; C, D are fixed…

semantic spec no help…? where does the transitivity reasoning come into play? I thought it would help

Claude

Where Transitivity Helps and Where It Doesn’t

You’ve isolated the question very precisely. Let me work through it carefully.

Setting up exactly what we have

Premise (specific C, D, syntactic):

A₁[C] <: B[D] \ S (syntactic)
B[D] <: A₁[C] \ S (syntactic)
B[D] <: A₂[C] \ S (syntactic)

By soundness, each of these implies the corresponding semantic statement:

1’. ∀(σ_E, σ_F over S). A₁[σ_E, X→{X:C}] <: B[σ_F, X→{X:D}]
2’. ∀(σ_E, σ_F over S). B[σ_F, X→{X:D}] <: A₁[σ_E, X→{X:C}]
3’. ∀(σ_E, σ_F over S). B[σ_F, X→{X:D}] <: A₂[σ_E, X→{X:C}]

Combining 1’ and 2’: ∀(σ_E, σ_F). A₁[σ_E, X→{X:C}] ~ B[σ_F, X→{X:D}], i.e. semantic equivalence with X pinned to C / D and other S-variables varying.

By IH on (A₁, B): either C ~ D (done), or A₁ ~ B \ S ∪ {X} semantically, i.e., (*) ∀(σ_E', σ_F' over S ∪ {X}). A₁[σ_E'] ~ B[σ_F']

Goal: B <: A₂ \ S ∪ {X} semantically, i.e., (Goal) ∀(σ_E', σ_F' over S ∪ {X}). B[σ_F'] <: A₂[σ_E']

Where transitivity does help

Pick any σ_E', σ_F' over S ∪ {X}. Write σ_F' = (σ_F, X→{X:F_X}) where σ_F is over S. Similarly σ_E' = (σ_E, X→{X:E_X}).

From (*), substitution-invariance of B over S ∪ {X} (Lemma 1 generalized): (†) B[σ_F'] ~ B[σ_F'', X→{X:F_X''}] for any σ_F'' and F_X''.

In particular, taking σ_F'' = σ_F (any) and F_X'' = D:

B[σ_F'] ~ B[σ_F, X→{X:D}] for any σ_F over S.

Combining with 3’:

B[σ_F'] ~ B[σ_F, X→{X:D}] <: A₂[σ_E, X→{X:C}], so B[σ_F'] <: A₂[σ_E, X→{X:C}] for any σ_E over S.

Transitivity has gotten us this far: the LHS is now arbitrary over S ∪ {X}, but the RHS still has X pinned to C.
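
Written out as a single chain, the steps so far read as follows. This is only a LaTeX restatement of the argument above, using (†) and 3’ as labeled in the text.

```latex
\begin{align*}
B[\sigma_F'] &\sim B[\sigma_F,\ X \mapsto \{X : D\}]
  && \text{by } (\dagger) \\
B[\sigma_F,\ X \mapsto \{X : D\}] &<: A_2[\sigma_E,\ X \mapsto \{X : C\}]
  && \text{by } 3' \\
\Longrightarrow\quad B[\sigma_F'] &<: A_2[\sigma_E,\ X \mapsto \{X : C\}]
  && \text{by transitivity}
\end{align*}
```

Here σ_F' ranges over S ∪ {X} while σ_E and σ_F range over S, so only the left-hand side has been freed.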

Where transitivity stops helping

We want B[σ_F'] <: A₂[σ_E', X→{X:E_X}] for arbitrary E_X. We have B[σ_F'] <: A₂[σ_E, X→{X:C}] for arbitrary σ_E.

To bridge this gap via transitivity, we’d need:

A₂[σ_E, X→{X:C}] <: A₂[σ_E', X→{X:E_X}] for arbitrary E_X (or equivalence).

This would be substitution-invariance of A₂ over X. But we don’t have it: we only have substitution-invariance of A₁ and B (from applying Lemma 1 symmetrically to (*)).

Without substitution-invariance of A₂, transitivity has no purchase here. This is the structural reason your proof is stuck: A₂ is in the equivalence “asymmetrically”. It’s bounded below by B (we have B <: A₂) but not equivalent to B. So we have no chain to ride.

Why the semantic spec really does seem to give no help

You wrote:

E, F is any, C, D is fixed… semantic spec no help…?

I want to validate this honestly: the semantic spec, used naively, really does not help for the missing piece. The missing piece is a property of A₂ that the equivalence-with-B premise simply doesn’t constrain (because A₂ is only <:-bounded by B, not equivalent).

This is genuinely the same gap I identified before, and it’s not closable by Lemma 1 / transitivity alone.

So is the lemma false?

Let me check whether A₂ actually needs to be substitution-invariant for the lemma’s conclusion to hold, or whether something weaker suffices.

The conclusion needs: B[σ_F'] <: A₂[σ_E'] for arbitrary σ_F', σ_E' over S ∪ {X}. We have B[σ_F'] <: A₂[σ_E, X→{X:C}] for arbitrary σ_E (from the chain above). The gap is varying X’s substituent on A₂’s side.

A₂ doesn’t need to be fully substitution-invariant. It just needs the property that B[σ_F'] <: A₂[σ_E, X→{X:C}] implies B[σ_F'] <: A₂[σ_E', X→{X:E_X}] for arbitrary σ_E', E_X.

This is weaker than full substitution-invariance of A₂: it’s a one-sided “the comparison B <: A₂ is insensitive to X’s substituent” property.

Why might this property hold? Because in any derivation of B̂ <: Â₂, the X-occurrences (which all arise from substitution) appear somewhere in the derivation. If they appear in absorbed positions (subsumed by Top or dropped via intersection elimination on the left), the derivation lifts to arbitrary substituents. If they appear in non-absorbed positions, they participate in label comparisons with whatever is on the other side of <:.

Recall: B’s X-occurrences in B̂ come from substituting X→{X:D} in B’s structure. By Lemma 1, B’s equivalence class is invariant under X’s substituent, so by the intuition that we’ve discussed (and that Lemma 2 makes precise), B’s X-occurrences are in absorbed positions. So in the derivation B̂ <: Â₂, B’s X-occurrences are absorbed.

But what about A₂’s X-occurrences? These come from substituting X→{X:C} in A₂’s structure. Are they in absorbed positions?

If A₂’s X-occurrence on the right of <: is in covariant position (e.g., A₂ has X covariantly), then the comparison B̂ <: A₂[..., X→{X:C}] requires B̂ to provide a covariant matching, i.e., B̂ must have something <: {X:C} in the corresponding position. Since B̂’s X-occurrences are in absorbed positions, the corresponding position in B̂ must contain something other than an X-bearing label; call it R, with R <: {X:C}.

For R <: {X:C} to hold (where R doesn’t contain X in active position): R could be a label {X:R'} with R' <: C. But R shouldn’t have active X, so R doesn’t have {X:_} in this position structurally: a contradiction unless R is precisely the substituted-in label, but we said B’s X doesn’t appear here.

OR R could be an intersection containing {X:R'}, but then B’s structure would have X actively here, contradicting absorption.

OR there’s no such R (R = Top would not satisfy Top <: {X:C}).

So B̂ <: A₂[..., X→{X:C}] with A₂ having active covariant X and B̂ having X only in absorbed positions is actually impossible, except via absorption on A₂’s side too.

Specifically, the comparison succeeds only if A₂’s X-position is also in absorbed position relative to the comparison, e.g., the branch of A₂ containing this X ends up on the left of an inner comparison and is dropped via intersection elimination, or a Top on the other side absorbs the whole thing.

Conclusion: A₂’s X-occurrences in any successful derivation B̂ <: Â₂ must be in absorbed positions (relative to the comparison structure). Therefore the derivation lifts to arbitrary X-substituents on A₂’s side too.

This is the absorption argument I was calling Lemma 2, and it’s necessary for the proof. Transitivity / Lemma 1 alone can’t establish this — it’s a structural property of derivations.

So the answer is:

Transitivity does help: it gives substitution-invariance of B. But to close the proof, you also need a structural argument (Lemma 2-style) that uses the substitution-invariance of B to conclude that A₂’s X-occurrences in the derivation B̂ <: Â₂ are absorbed.

Without Lemma 2, the proof is genuinely stuck at the gap you identified. With Lemma 2, the proof closes.

Now, your earlier observation about case-analysis difficulty

You raised an important point: the semantic version doesn’t let you case-split on which intersection branch wins. But Lemma 2’s proof would naturally be done by induction on a derivation — and to do that, you’d need a derivation in hand, not just a semantic statement.

This suggests that Lemma 2 should be proven for the syntactic system (A <: B \ S syntactic with derivation case-splits), and then transferred to the semantic system via soundness.

So the structure becomes:

Define \ S syntactically.
Prove syntactic \ S is sound w.r.t. semantic \ S (easy, by structural induction).
Prove Lemma 2 syntactically: if B <: T \ S (syntactic) and B’s X-occurrences are absorbed (a syntactic notion), then for any E, F, B[X→{X:F}] <: T[X→{X:E}] \ S (syntactic). By induction on the derivation.
Combine: in the inversion lemma proof, after getting B ‘s syntactic substitution-invariance from (*) and soundness, apply Lemma 2 to lift B̂ <: Â₂ syntactically, then use soundness to get the semantic conclusion.

This route uses the syntactic system precisely where the semantic system fails to give traction (case analysis on derivations) and uses the semantic system where it gives the cleanest specification.

My honest summary

Transitivity (Lemma 1) gets you halfway — substitution-invariance of A₁ and B. It doesn’t get you the rest. The remaining gap is closed by an absorption-propagation argument (Lemma 2), which needs to be proved by induction on a syntactic derivation.

The semantic version of the lemma is provable, but the proof goes via the syntactic system — at least, that’s the cleanest route I can see. The pure-semantic proof attempt gets stuck precisely at the case-analysis-on-intersection issue you identified.

Your instinct that the semantic spec doesn’t help here is correct. Transitivity helps for one side (B’s) but not the other (A₂’s), and bridging the gap requires structural reasoning that’s most naturally done syntactically.

Concrete suggestion for moving forward

I think the most productive next step is to write out Lemma 2 carefully in syntactic form:

Lemma 2 (Absorption Propagation, syntactic). Let B be a type and suppose B[X→{X:F}] ~ B[X→{X:F'}] \ S (syntactic) for all F, F' (this is the substitution-invariance property). Then for any type T and any derivation B[X→{X:F}] <: T \ S, we have B[X→{X:F''}] <: T \ S for any F''.

Or perhaps more carefully:

Lemma 2’. If B[X→{X:F}] <: T \ S and B’s X-occurrences are syntactically in absorbed positions (a structural property defined inductively), then B[X→{X:F'}] <: T \ S for any F'.

The exact statement depends on how you formalize “absorbed position.” Working this out should be the next step.

Would you like to try formalizing the syntactic absorption notion together, or is this the kind of thing that’s now clear enough that you can take it from here?

You

For your proposed Lemma 2, isn’t it the A2 side that needs rewriting? We can already rewrite the B-side substitution at will thanks to Lemma 1.

Claude

You’re Right — Let Me Restate Lemma 2 Correctly

You’ve caught a real error in my framing. Let me redo this carefully.

What we have and what we need

After applying transitivity (Lemma 1), the situation is:

Have: B[σ_F'] <: A₂[σ_E, X→{X:C}] for arbitrary σ_F' over S ∪ {X} and arbitrary σ_E over S.

Want: B[σ_F'] <: A₂[σ_E', X→{X:E_X}] for arbitrary σ_F' over S ∪ {X} and arbitrary σ_E' over S ∪ {X} (i.e., arbitrary E_X too).

So the B-side is already free (Lemma 1 took care of it), and what we need to vary is the A₂-side substituent for X.

So Lemma 2 should be about lifting A₂’s side, not B’s side. My framing was wrong, apologies.

Correctly stated Lemma 2

Lemma 2 (Absorption Propagation, corrected). Suppose we have a derivation B' <: A₂[X→{X:C}] \ S where B' is some specific type (here, B[σ_F'] for any σ_F'). If A₂’s X-occurrences (the ones substituted to {X:C}) appear only in absorbed positions in this derivation, then B' <: A₂[X→{X:E_X}] \ S for any E_X.

This is the real lemma we need: lifting the right-hand side’s X-substituent to be arbitrary, given that the right-hand side’s X-occurrences are absorbed in the derivation.

Why it should be true

Intuitively: if A₂’s X-occurrences are in absorbed positions in the derivation B' <: Â₂, those positions are handled by Top-absorption rules or intersection-elimination rules, neither of which inspects the body of the absorbed/dropped subexpression. So replacing {X:C} with {X:E_X} in those positions doesn’t change whether the derivation rules apply.

Proof would be by induction on the derivation of B' <: A₂[X→{X:C}] \ S. At each step, look at the rule applied:

If the rule is _ <: Top (an inner comparison whose right side is Top, with {X:C} sitting somewhere on the left after a contravariant flip): replacing {X:C} by {X:E_X} doesn’t affect this, since the rule never inspects the left side and the absorbing Top is still Top.
If the rule is intersection elimination (or the dual), and the dropped branch contains {X:C}: replacing {X:C} by {X:E_X} doesn’t affect the kept branch.
If the rule is structural (arrow, label) and {X:C} appears in a recursive subgoal: the subgoal must itself be one where {X:C} is in absorbed position, so by IH it lifts.
If the rule is the label rule {X:E} <: {X:F} requiring E <: F, and the {X:C} on the right is being matched: this would be a non-absorbed position, contradicting our hypothesis.

So the induction goes through, provided the hypothesis “A₂’s X-occurrences are in absorbed positions in this derivation” holds.
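
One way to experiment with this lemma is to instrument a checker so that the label rule is disabled for names in an avoidance set, in the spirit of the "A <: B avoiding S" idea. The sketch below is an assumption-laden illustration, not the syntactic system itself: the encoding, the avoid parameter, and the helper lifts are all hypothetical, and disabling the X label rule on both sides is stronger than only requiring A₂’s occurrences to be absorbed. If a derivation exists without the X label rule, the check should survive replacing the body on the right.

```python
TOP = ("top",)
NAT = ("base", "nat")
BOOL = ("base", "bool")

def sub(a, b, avoid=frozenset()):
    """Decide a <: b; the label rule is disabled for names in `avoid`."""
    if b == TOP:
        return True
    if b[0] == "&":
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "&":
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] == "->" and b[0] == "->":
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "lab" and b[0] == "lab":
        return a[1] == b[1] and a[1] not in avoid and sub(a[2], b[2], avoid)
    return a == b

def lab_x(body):
    return ("lab", "X", body)

def lifts(b_prime, a2_of, c, bodies):
    """None if b_prime <: a2_of(c) needs the X label rule; otherwise
    whether b_prime <: a2_of(e) holds for every sampled body e."""
    if not sub(b_prime, a2_of(c), avoid=frozenset({"X"})):
        return None
    return all(sub(b_prime, a2_of(e)) for e in bodies)

bodies = [TOP, NAT, BOOL]

# Absorbed by Top: B' = Top -> Top, A2 = X -> Top.
by_top = lifts(("->", TOP, TOP), lambda e: ("->", lab_x(e), TOP), NAT, bodies)
# Dropped branch: B' = nat -> nat, A2 = (X & nat) -> nat.
by_drop = lifts(("->", NAT, NAT),
                lambda e: ("->", ("&", lab_x(e), NAT), NAT), NAT, bodies)
# Active occurrence: B' = {X:nat}, A2 = X. The hypothesis is not met.
active = lifts(lab_x(NAT), lab_x, NAT, bodies)

print(by_top, by_drop, active)  # True True None
```

In the third case the plain check {X:nat} <: {X:bool} fails, so the comparison really does depend on the body, as the last bullet above predicts.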

The remaining question: do A₂’s X-occurrences land in absorbed positions?

This is what I argued informally before, and I want to redo it more carefully now.

Given:

B[σ_F'] <: A₂[σ_E*, X→{X:C}] (σ_E over S is arbitrary; we fix one for now and call it σ_E*).
B’s X-occurrences in B[σ_F'] are in absorbed positions (because B is substitution-invariant in X, by the substitution-invariance lemma applied to (*)).

In the derivation, when we trace B[σ_F'] <: A₂[σ_E*, X→{X:C}], we walk down both types’ structures.

Consider any X-occurrence in A₂ (now {X:C}) on the right. It sits at some path in A₂’s structure. The derivation has reached this position via some sequence of rule applications.

Case 1: the position is reached via covariant matching from the top. Then the derivation requires the left side (some descendant of B[σ_F']) to match it: must be <: {X:C}. Possible left-side shapes:

Top: Top <: {X:C} ✗.
A different head constructor (arrow, etc.): fails.
A label {X:R} with R <: C: succeeds. But this requires B to have X (as a variable) in the matching structural position, and by B’s substitution-invariance, B’s X-occurrences are in absorbed positions, so they shouldn’t appear in active matching positions. Contradiction, unless this label match doesn’t actually arise.
Intersection on the left, with one branch being {X:R}: same issue.

So Case 1 is impossible (or, more precisely, would force B to violate its substitution-invariance).

Case 2: the position is reached via contravariant matching. Then the derivation requires the right side to be <: something on the left side. So {X:C} on the right of <: actually means {X:C} is on the left of an inner <: (after the contravariant flip). It needs {X:C} <: something. If that something is Top, then {X:C} <: Top ✓, which is absorption. If it is a matching label, the label bodies must be compared. If the matching label is {X:R} from B’s side, then again B’s X would need to be in that position, contradicting absorption.

So in Case 2, the X-occurrence in A₂ is absorbed by Top, which is fine.

Case 3: the position is in a dropped intersection branch. When A₂ is an intersection with an X-bearing branch and sits on the right (comparing ... <: A₂[..., X→{X:C}]), every branch must be satisfied, so nothing is dropped. And intersection elimination on the left of <: doesn’t apply directly to A₂, since A₂ itself is on the right.

Wait, let me reconsider. We have B' <: A₂[..., X→{X:C}]. A₂ here is on the right. If A₂ is an intersection, we need the LHS to be <: each branch. So no dropping on the right.

If A₂ is not an intersection (say an arrow) but contains intersections in subpositions (e.g., A₂ = (T₁ & T₂) -> Top), then the arrow rule flips the domain comparison: B'_dom -> B'_cod <: (T₁ & T₂) -> Top requires T₁ & T₂ <: B'_dom. That puts the intersection on the left of the inner comparison, where one branch may be dropped.

For an X-occurrence in A₂ to be in a “dropped intersection” position, the intersection containing it needs to be on the left of some inner subtyping comparison. Given A₂ is on the right of the outer <:, this happens when contravariance flips an inner intersection to the left, i.e., after an odd number of flips. E.g., A₂ = (X & T) -> Top: B'_dom -> Top <: (X & T) -> Top requires X & T <: B'_dom, which is left intersection elimination, so the X branch can be dropped. By contrast, in A₂ = ((X & T) -> Top) -> Top the double flip returns X & T to the right: (T₁ -> Top) -> Top <: ((X & T) -> Top) -> Top requires T₁ <: X & T, so both branches must be satisfied and nothing is dropped. So yes, X can be in a position that’s dropped via intersection elimination.

This is genuine absorption-via-dropping.

Net conclusion: A₂’s X-occurrences in the derivation B' <: A₂[..., X→{X:C}] are either absorbed by Top (Case 2) or dropped by intersection elimination (Case 3). Case 1 is impossible because it would force B to violate substitution-invariance.
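
To see on which side of an inner comparison an intersection ends up, here is a minimal Python sketch of a checker. It assumes the usual algorithmic rules without distributivity (Top on the right, both branches for an intersection on the right, one branch for an intersection on the left, contravariant arrow domains, covariant label bodies). The tuple encoding and the names `sub`, `arrow`, `inter`, `label` are illustrative assumptions, not something fixed by this conversation.

```python
# Types are tuples (an assumed encoding): ("top",), ("var", name),
# ("arrow", dom, cod), ("and", left, right), ("label", name, body).
TOP = ("top",)

def var(name): return ("var", name)
def arrow(dom, cod): return ("arrow", dom, cod)
def inter(left, right): return ("and", left, right)
def label(name, body): return ("label", name, body)

def sub(a, b):
    """Decide a <: b under the assumed non-distributive rules."""
    if b[0] == "top":                      # anything <: Top (absorption)
        return True
    if b[0] == "and":                      # intersection on the right: both branches
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":                      # intersection on the left: keep one branch
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "arrow" and b[0] == "arrow":
        return sub(b[1], a[1]) and sub(a[2], b[2])   # the domain comparison flips
    if a[0] == "label" and b[0] == "label":
        return a[1] == b[1] and sub(a[2], b[2])      # compare label bodies
    if a[0] == "var" and b[0] == "var":
        return a[1] == b[1]
    return False

T = var("t")                   # an opaque type standing in for T
XC = label("X", var("c"))      # {X:C}, with C an opaque variable

# One flip: X & T lands on the left of the inner comparison, X branch dropped.
one_flip = sub(arrow(T, TOP), arrow(inter(XC, T), TOP))
# Two flips: X & T is back on the right, so both branches must be satisfied.
two_flips = sub(arrow(arrow(T, TOP), TOP),
                arrow(arrow(inter(XC, T), TOP), TOP))
# Top absorption: {X:C} only ever gets compared against Top.
absorbed = sub(arrow(TOP, TOP), arrow(XC, TOP))

print(one_flip, two_flips, absorbed)
```

Under these assumed rules the three results are True, False, True: dropping is available after one flip, not after two, and the Top case never inspects C.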

Putting it together

So:

B’s substitution-invariance comes from Lemma 1 (transitivity, free).
A₂’s X-occurrences are in absorbed positions in any derivation B' <: A₂[..., X→{X:C}]; this is the structural fact that needs Lemma 2.
Therefore the derivation lifts to arbitrary E_X on A₂’s side, giving the conclusion.

The lemma that does the heavy lifting is precisely:

Lemma 2 (corrected and final). If B[σ_F'] <: A₂[X→{X:C}] \ S (syntactic derivation) and B’s X-occurrences in B[σ_F'] are in absorbed positions (which holds when B is substitution-invariant in X), then for any E_X, B[σ_F'] <: A₂[X→{X:E_X}] \ S.

The proof is by induction on the derivation, with the case analysis I sketched: every X-occurrence on A₂’s side must be in absorbed position, because otherwise the matching against B’s side would force B to have an active X, contradicting B’s substitution-invariance.

My corrected position

Thank you for the catch. The picture is:

Lemma 1 (transitivity): gives B’s substitution-invariance. Free.
Lemma 2 (absorption propagation, on A₂’s side): lifts the X-substituent on A₂’s side, given that B’s substitution-invariance forces A₂’s X-occurrences to be in absorbed positions in the derivation.

Together, the gap closes: starting from B[σ_F'] <: A₂[X→{X:C}] (with the B-side already free thanks to Lemma 1), Lemma 2 lifts the A₂-side to arbitrary E_X.

The key technical ingredient is the structural argument that A₂’s X-occurrences must be absorbed. This is what B’s substitution-invariance buys us, but it’s not something pure transitivity gives. It requires looking at the derivation.

So the proof obligation is to prove Lemma 2 by induction on the syntactic derivation, leveraging the substitution-invariance of B to rule out the “bad” case (Case 1 above).

Does this revised picture match your intuition? And does Lemma 2 in this form look provable to you in your syntactic framework?

You

Problem: what would the overall structure of the proof of the inversion lemma be?

Pf for (SemInv-Multi). If A[X -> {X:C}] ~ B[X -> {X:D}] \ S (syntactic equivalence under avoidance set S), then either:

C ~ D, or
A ~ B \ S ∪ {X} (semantic equivalence under extended avoidance set).

In particular, how do we get the fact that B’s X-occurrences in B[σ_F'] are in absorbed positions (which holds when B is substitution-invariant in X)?

Can you be more formal and careful about this? Is the absorption going to be a syntactic definition or semantic spec?

Claude

Structuring the Inversion Lemma Proof

This is the right question to push on, and I want to be more careful than I have been. Let me think through the structure properly before committing to a formulation.

The basic tension

The inversion lemma has:

Syntactic premise: A[X→{X:C}] ~ B[X→{X:D}] \ S (a syntactic derivation exists).
Semantic conclusion (branch 2): A ~ B \ S ∪ {X} semantically (universal substitution property).

In the proof, the IH (recursive call) will have the same shape: syntactic premise → semantic conclusion. So when we apply IH to (A₁, B) we get A₁ ~ B \ S ∪ {X} semantically, not syntactically.

This matters for everything downstream: when we want to derive substitution-invariance of B (Lemma 1), we’re working with semantic equivalence, so Lemma 1 gives us semantic substitution-invariance of B.

Two routes for “absorption”

Route A: Semantic absorption. Define “B’s X-occurrences are in absorbed positions” semantically as: B[X→{X:F}] ~ B[X→{X:F'}] for all F, F' (equivalence under varying substituent). This is exactly the conclusion of Lemma 1 applied to (*). So we get semantic absorption “for free” once (*) holds.

Route B: Syntactic absorption. Define an inductive predicate absorbed_X(T) on the syntax of T: roughly, X doesn’t appear in T, or every occurrence of X is “guarded” by Top (covariantly under Top, contravariantly above Top) or in an intersection branch that’s “redundant.” This is structural and inductively definable.

Route A is cleaner specification-wise. Route B is more useful for proof-by-induction-on-derivation.
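
To make Route A concrete, here is a small Python sketch that tests B[X→{X:F}] ~ B[X→{X:F'}] over a finite list of candidate substituents. It is only a sanity check, not a proof: the real property quantifies over all F, F', and the checker assumes non-distributive algorithmic rules. The encoding and the names `sub`, `subst`, `invariant_in` are hypothetical.

```python
# Assumed encoding: ("top",), ("var", n), ("arrow", d, c), ("and", l, r), ("label", n, body).
TOP = ("top",)

def var(name): return ("var", name)
def arrow(dom, cod): return ("arrow", dom, cod)
def inter(left, right): return ("and", left, right)
def label(name, body): return ("label", name, body)

def sub(a, b):
    """Decide a <: b under the assumed non-distributive rules."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1]) and sub(a, b[2])
    if a[0] == "and":
        return sub(a[1], b) or sub(a[2], b)
    if a[0] == "arrow" and b[0] == "arrow":
        return sub(b[1], a[1]) and sub(a[2], b[2])
    if a[0] == "label" and b[0] == "label":
        return a[1] == b[1] and sub(a[2], b[2])
    if a[0] == "var" and b[0] == "var":
        return a[1] == b[1]
    return False

def equiv(a, b):
    """Mutual subtyping."""
    return sub(a, b) and sub(b, a)

def subst(t, x, r):
    """Replace every occurrence of the variable x in t by r."""
    tag = t[0]
    if tag == "var":
        return r if t[1] == x else t
    if tag in ("arrow", "and"):
        return (tag, subst(t[1], x, r), subst(t[2], x, r))
    if tag == "label":
        return ("label", t[1], subst(t[2], x, r))
    return t  # Top

def invariant_in(t, x, candidates):
    """Check t[x -> {x:F}] ~ t[x -> {x:F'}] for all F, F' in candidates."""
    instances = [subst(t, x, label(x, f)) for f in candidates]
    return all(equiv(p, q) for p in instances for q in instances)

candidates = [TOP, var("nat"), arrow(TOP, TOP)]

# (Top -> Top) & (X -> Top): the X branch is always subsumed or dropped.
b_absorbed = inter(arrow(TOP, TOP), arrow(var("X"), TOP))
# X -> Top on its own: the domain label is actively compared.
b_active = arrow(var("X"), TOP)

print(invariant_in(b_absorbed, "X", candidates))
print(invariant_in(b_active, "X", candidates))
```

On these candidates the first type passes and the second fails (for example, {X:nat} -> Top is not a subtype of {X:Top} -> Top), which matches the intended reading of “absorbed” versus “active” occurrences.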

Which route does Lemma 2 want?

Lemma 2’s job: given B[σ_F'] <: A₂[X→{X:C}] (a derivation) and “B is absorbed in X,” conclude B[σ_F'] <: A₂[X→{X:E_X}] for arbitrary E_X.

If “absorbed” is semantic, then Lemma 2’s hypothesis is B[X→{X:F}] ~ B[X→{X:F'}] semantically — a universal property.

The proof of Lemma 2 will be by induction on the derivation B[σ_F'] <: A₂[X→{X:C}]. At each step, we case-analyze the rule. To rule out “bad” cases (where A₂’s X would force B to have active X), we need to derive a contradiction with B’s substitution-invariance.

If B’s substitution-invariance is semantic, we’d need to show that the bad case implies B[X→{X:F}] ≁ B[X→{X:F'}] for some F, F', which is a semantic statement. This is doable but indirect: from a derivation snippet, build a semantic counter-instance.

If B’s substitution-invariance is syntactic (B[X→{X:F}] ~ B[X→{X:F'}] \ S syntactically for all F, F'), then the contradiction is more direct, but we’d need to know that the syntactic version holds.

Hmm, but we don’t a priori have syntactic substitution-invariance of B. We only have the semantic version (from IH + Lemma 1). So Lemma 2’s hypothesis must be semantic.

A cleaner reformulation

Let me reformulate Lemma 2 to take a semantic hypothesis and produce a semantic conclusion:

Lemma 2 (Semantic Absorption Propagation). Suppose:

For all F, F', B[X→{X:F}] ~ B[X→{X:F'}] semantically (equivalence under varying X-substituent on B).

B[X→{X:D}] <: T[X→{X:C}] for some specific C, D, and some type T. (Syntactic derivation exists, or just semantically.)

Then for all E_X, F_X, B[X→{X:F_X}] <: T[X→{X:E_X}] semantically.

If this holds, then the inversion lemma’s stuck case closes: take T = A₂, use hypothesis 1 from (*) + Lemma 1, hypothesis 2 from the soundness of the original syntactic premise, conclude what we need.

But the proof of Lemma 2… how does it go?

The proof of Lemma 2: a real attempt

Here’s where I want to be honest: I don’t think Lemma 2 has a clean proof without reaching into the syntactic structure somehow. Let me try.

Attempt 1: pure semantics. From hypothesis 1, vary B’s X-substituent freely. From hypothesis 2 (specific instance), get one inequality. We want to vary T’s X-substituent. There’s no direct semantic operation that does this; we need some structural reasoning about why T’s X-occurrences must be “harmless.”

Attempt 2: via syntactic derivation. Take a syntactic derivation of B[X→{X:D}] <: T[X→{X:C}]. Induct on it. At each rule, case-analyze. Use semantic substitution-invariance of B to rule out bad cases.

The “bad case” is: a rule application that compares the {X:D} on B’s side (or a recursive descendant) with the {X:C} on T’s side via the label-comparison rule, forcing D <: C.

If this happens, then the derivation’s success depends on D <: C. By soundness (syntactic → semantic), the comparison B[X→{X:F}] <: T[X→{X:E}] for arbitrary F, E would similarly require F <: E — false. So semantically, the universal <: fails.

But hypothesis 1 is about ~ (mutual <:), not just <:. Does hypothesis 1 rule out this case?

Hypothesis 1: B[X→{X:F}] ~ B[X→{X:F'}]. This says B’s class is invariant under F. It does not directly say “B’s X-occurrences are in absorbed positions in any derivation involving B.”

Wait, here’s the point: hypothesis 1 doesn’t directly constrain derivations like B <: T. It only constrains derivations like B <: B' (with instances of B on both sides). So we can’t directly use hypothesis 1 to constrain how B’s side participates in B <: T.

Hmm. So Lemma 2 as stated might not even be true with hypothesis 1 alone.

Stepping back

Let me reconsider. Maybe Lemma 2 needs to be different.

Alternative formulation: Maybe Lemma 2 should be stated as a property of pairs:

Lemma 2’. If B[X→{X:F}] <: T[X→{X:E}] for all F, E (universal <:), then… [some conclusion useful for the inversion lemma].

But this is circular: this is exactly the conclusion we want.

Another alternative: Maybe we should prove the inversion lemma’s conclusion directly without going through an explicit Lemma 2, using only Lemma 1 (transitivity) and direct semantic reasoning.

Let me retry the stuck case with this in mind.

Revisiting the stuck case directly

Premise (after applying soundness):

(P1) ∀σ_E, σ_F over S. A₁[σ_E, X→{X:C}] ~ B[σ_F, X→{X:D}]
(P2) ∀σ_E, σ_F over S. B[σ_F, X→{X:D}] <: A₂[σ_E, X→{X:C}]

IH on (A₁, B) (assuming the C ~ D case is dispatched):

(*) ∀σ_E', σ_F' over S∪{X}. A₁[σ_E'] ~ B[σ_F']

By Lemma 1 from (*):

(L1_B) ∀σ_F', σ_F'' over S∪{X}. B[σ_F'] ~ B[σ_F'']
(L1_A1) ∀σ_E', σ_E'' over S∪{X}. A₁[σ_E'] ~ A₁[σ_E'']

Goal: ∀σ_E', σ_F' over S∪{X}. B[σ_F'] <: A₂[σ_E']

Pick arbitrary σ_E' = (σ_E, X→{X:E_X}) and σ_F' = (σ_F, X→{X:F_X}).

Chain:

B[σ_F'] ~ B[σ_F, X→{X:D}] by (L1_B).
B[σ_F, X→{X:D}] <: A₂[σ_E, X→{X:C}] by (P2).
So B[σ_F'] <: A₂[σ_E, X→{X:C}].

We need B[σ_F'] <: A₂[σ_E'] = A₂[σ_E, X→{X:E_X}]. The gap: A₂[σ_E, X→{X:C}] vs A₂[σ_E, X→{X:E_X}].

If we had A₂[σ_E, X→{X:C}] <: A₂[σ_E, X→{X:E_X}], we’d be done. But this is not necessarily true — A₂ is not substitution-invariant in X (we never argued it is).

Observation: We need this for all E_X, including E_X = Bot (if we have Bot) or other small types. If A₂ has covariant X, then A₂[X→{X:C}] has {X:C} covariantly, which is generally not <: A₂[X→{X:E_X}] for E_X smaller than C.
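
Written out as a chain (a LaTeX restatement of the steps above, with the unavailable step marked by a question mark):

```latex
\begin{align*}
B[\sigma_F'] &\sim B[\sigma_F,\, X \mapsto \{X : D\}] && \text{by (L1\_B)} \\
B[\sigma_F,\, X \mapsto \{X : D\}] &<: A_2[\sigma_E,\, X \mapsto \{X : C\}] && \text{by (P2)} \\
B[\sigma_F'] &<: A_2[\sigma_E,\, X \mapsto \{X : C\}] && \text{by transitivity} \\
A_2[\sigma_E,\, X \mapsto \{X : C\}] &\overset{?}{<:} A_2[\sigma_E,\, X \mapsto \{X : E_X\}] && \text{not available in general}
\end{align*}
```

For instance, if A₂ is just the variable X, the last line reads {X:C} <: {X:E_X}, which by the label rule needs C <: E_X.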

But wait — we want B[σ_F'] <: A₂[σ_E, X→{X:E_X}]. We don’t necessarily go through A₂[σ_E, X→{X:C}]. Maybe there’s a more direct argument.

Let me try the other direction. Use the symmetric structure.

Trying the other side

We have (P1) giving us A₁[σ_E, X→{X:C}] <: B[σ_F, X→{X:D}] and the reverse, for arbitrary σ_E, σ_F over S.

We don’t have an analog of (P1) for A₂ — that’s the asymmetry.

What we do have: the original syntactic premise also includes A[σ_E, X→{X:C}] <: B[σ_F, X→{X:D}], which in the case Â₁ <: B̂ (the case we’re in) gives us nothing new about A₂.

We also have B[σ_F, X→{X:D}] <: A[σ_E, X→{X:C}], which expands to both B... <: A₁... and B... <: A₂... — that’s (P2), no help for the gap.

So we genuinely don’t have <: between A₂ and B in both directions, and we can’t bootstrap A₂’s substitution-invariance via Lemma 1.

The honest situation

I now think the lemma as stated (with the simple two-branch disjunction) may not be provable purely semantically, even with Lemma 1. The proof obstruction is real:

We can vary B’s side freely (Lemma 1).
We can’t vary A₂’s side freely without an additional structural argument.
The structural argument requires looking at the derivation B <: A₂, i.e., reaching into the syntactic system.

This means the proof necessarily mixes semantic and syntactic reasoning. The inversion lemma’s premise should perhaps be syntactic, the conclusion semantic, and the proof uses syntactic case-analysis at the critical step.

Concrete proposal for the proof structure

Let me lay out a more honest proof structure:

Theorem (Inversion). If A[X→{X:C}] ~ B[X→{X:D}] \ S syntactically, then either:

C ~ D semantically, or

A ~ B \ S ∪ {X} semantically.

Proof structure:

Induct on the size of A + B.

In each case, do syntactic case-analysis on the derivation of A[X→{X:C}] ~ B[X→{X:D}] \ S (this is where syntactic gives us hooks the semantic version lacks).

Stuck case: A = A₁ & A₂, B ordinary. The derivation of the equivalence involves derivations:

A₁[X→{X:C}] <: B[X→{X:D}] \ S (by case assumption that branch 1 wins on the <: direction; the symmetric case is similar).
B[X→{X:D}] <: A₁[X→{X:C}] \ S.
B[X→{X:D}] <: A₂[X→{X:C}] \ S.

By IH on (A₁, B) (whose derivation is syntactically given by the first two): either C ~ D (done), or A₁ ~ B \ S ∪ {X} semantically.

In the second case:

By Lemma 1 applied to (*): B is semantically substitution-invariant in X (and over S).
Use this together with the syntactic derivation B[X→{X:D}] <: A₂[X→{X:C}] \ S to conclude B <: A₂ \ S ∪ {X} semantically.

Step 2 is the new lemma we need. Let me state it carefully:

Lemma 2 (revised, mixed syntactic/semantic). Suppose:

B' is semantically substitution-invariant in X over S (i.e., ∀σ_F over S, F, F'. B'[σ_F, X→{X:F}] ~ B'[σ_F, X→{X:F'}] semantically).

There is a syntactic derivation of B'[X→{X:D}] <: T[X→{X:C}] \ S for specific C, D.

Then B' <: T \ S ∪ {X} semantically (i.e., ∀σ_E', σ_F' over S∪{X}. B'[σ_F'] <: T[σ_E']).

The hypothesis combines a semantic invariance property of B' with a syntactic derivation. The conclusion is semantic.

The proof goes by induction on the syntactic derivation. At each rule, case-analyze:

Trivial cases (Top, intersection elimination dropping an X-bearing branch on the right of <:, etc.) lift directly.
The “label vs label” rule: would force comparing the label at the X-position of B' with the label at the X-position of T, giving the constraint D <: C. If this case arises, we need to show it actually means B' has X actively in its structure, contradicting the semantic substitution-invariance of B'.

Ruling out the bad case: the derivation reaches a label-comparison {X:R} <: {X:C} where {X:R} came from the structure of B' (from substituting X→{X:D} and possibly other S-vars). If R = D (the substituent), this means B' had X (as a variable) in this position structurally. Then varying D in the substituent of B' would change this comparison, breaking its success. By soundness (the syntactic derivation lifts to semantic), B'[X→{X:F}] <: T[X→{X:C}] would fail for F with F ≮: C, contradicting the substitution-invariance of B', which forces it to behave the same way under varying F.

Let me try to formalize this contradiction more carefully.

If the derivation requires {X:D} <: {X:C} (i.e., D <: C) at some step, then by the same derivation skeleton with D replaced by D', we’d require D' <: C. For arbitrary D', this doesn’t hold. So B'[X→{X:D'}] <: T[X→{X:C}] would fail (using a structurally identical derivation that just gets stuck at this step).

But maybe a different derivation succeeds for D'? Possibly. Subtyping derivations aren’t unique.

So the contradiction is more subtle: we’d need to show that if the derivation requires D <: C actively, then no derivation succeeds for some choice of D'. This is a completeness-style argument — for the comparison to succeed, some derivation must work, and if all derivations require the same active label comparison, none works for generic D'.

Hmm, this is getting tangled. Let me think about whether there’s a cleaner formulation.

A more direct approach — define syntactic absorption

Actually, I think the cleanest path is to define a syntactic notion of absorption and prove things in two steps.

Definition (Syntactic absorption). A type T is X-absorbed, relative to the polarity (covariant or contravariant) of the context it appears in, by the following inductive clauses:

Top is X-absorbed (any polarity).

T₁ & T₂ is X-absorbed iff… (this is the tricky part; it needs polarity-dependent rules).

T₁ → T₂ is X-absorbed iff T₁ is X-absorbed contravariantly and T₂ is X-absorbed covariantly.

{X : R} is never X-absorbed in covariant position; it is X-absorbed in contravariant position only if it appears under a Top.

Y (variable, Y ≠ X) is X-absorbed.

X (variable): but wait, we substitute X → {X:_}, so X as a variable shouldn’t appear in B[X→{X:D}]. So we don’t need this case.

The intersection case is genuinely tricky because absorption can come from “the other branch handles it” — but for T₁ & T₂ on the right (covariant context), both branches must be checked, so absorption requires both. For left (contravariant context), only one branch needs to handle it, so absorption can come from one.

Then prove:

B is syntactically X-absorbed iff B is semantically substitution-invariant in X (in some appropriate sense).
If B is syntactically X-absorbed and there’s a syntactic derivation B[X→{X:D}] <: T[X→{X:C}] \ S, then T is also syntactically X-absorbed in the relevant positions.
Therefore the derivation lifts to arbitrary substituents.

This is more work, but it makes the absorption property a first-class syntactic notion that we can reason about explicitly.

My honest revised position

The picture I now have:

Lemma 1 (semantic substitution-invariance via transitivity): OK, free, semantic-only.
The hard step: lifting the asymmetric B <: A₂ to arbitrary X-substituents. This requires either:
- A semantic argument that’s clever in some way I’m not seeing, OR
- A mixed argument that uses the syntactic derivation of B <: A₂ to extract structural information about A₂, OR
- A definition of syntactic absorption and a careful syntactic-side argument.

I lean toward the third option as the cleanest, but it requires real work to set up.

To answer your direct questions

In particular, how do we get the fact that B’s X-occurrences in B[σ_F'] are in absorbed positions?

From (*) (the semantic equivalence-under-avoidance for (A₁, B)) + Lemma 1, we get semantic substitution-invariance of B. To translate this into a syntactic absorption fact, we’d need a completeness-direction theorem: if B is semantically substitution-invariant, then B is syntactically X-absorbed.

Such a theorem is what justifies passing from the semantic invariance to a structural fact about derivations involving B.

Is the absorption going to be a syntactic definition or semantic spec?

I now think it should be both, with a soundness/completeness theorem connecting them:

Syntactic absorption: an inductive predicate on types, polarity-aware.
Semantic substitution-invariance: the universal-equivalence-under-substitution property.
They coincide for “well-formed” types in some appropriate sense.

The inversion lemma proof would then proceed:

Get semantic substitution-invariance of B from (*) + Lemma 1.
Convert to syntactic absorption via the completeness theorem.
Use syntactic absorption + the syntactic derivation B <: A₂ to lift the X-substituent on A₂’s side (this is where the derivation-induction happens, with absorption as a structural side condition).
Conclude semantically.

This is more complex than I’d like, but I think it’s the honest structure. Pure semantic reasoning isn’t enough; pure syntactic reasoning loses the cleanest specification; the mix gives both.

Caveats I want to flag

I haven’t fully verified that the syntactic absorption definition + soundness/completeness pair actually works. The intersection case in the absorption definition is delicate, and getting it right requires care.
The completeness direction (semantic invariance → syntactic absorption) might fail in subtle ways. For instance, a type might be semantically substitution-invariant by a “lucky coincidence” of intersection branches, without each branch being individually absorbed. Whether the syntactic predicate captures all such cases is unclear without working it out.
There may be a cleaner formulation I’m missing. The fact that the proof is getting this complicated suggests either the lemma is genuinely hard, or there’s a perspective shift that simplifies things.

So my recommendation: try to formalize the syntactic absorption definition, see if the soundness/completeness with semantic invariance works out, and use it in the inversion proof. If that bogs down, it might be worth stepping back and asking whether the lemma’s statement should be different (e.g., conclusion stated syntactically instead of semantically, with a separate theorem connecting syntactic and semantic avoidance).

Does this match your thinking, or do you see a cleaner path?

You

Can you step back and think if the fact that I made the conclusion to be semantically avoiding is necessary? Can’t we just use the syntactic conclusion and make use of the transitivity (as you have noted) property to reach what we have gotten so far?

Claude

Stepping Back: Reconsidering the Statement

This is a really good question to push on. Let me think carefully about what’s actually needed.

The shape we’d want

(SynInv-Multi). If A[X→{X:C}] ~ B[X→{X:D}] \ S syntactically, then either:

C ~ D, or

A ~ B \ S ∪ {X} syntactically.

Both premise and conclusion are syntactic. The induction is straightforwardly on derivations. Case-analysis on intersection branches is available. Substitution-invariance of B (the “Lemma 1” property) — does it hold syntactically?

Does Lemma 1 hold syntactically?

Lemma 1 (semantic version): If A[θ_E] ~ B[θ_F] semantically for all θ_E, θ_F, then B[θ_F] ~ B[θ_F'] semantically for all θ_F, θ_F'.

The proof was pure transitivity through A.

The syntactic analog would be: If A ~ B \ S syntactically (universal-substitution syntactically), then B ~ B \ S syntactically.

Wait — B ~ B \ S syntactically is what reflexivity-modulo-avoidance gives, and you noted earlier that reflexivity holds when free variables are disjoint from S, but might fail otherwise.

So B ~ B \ S syntactically is not automatic. Hmm. But the transitivity argument should still work: from A ~ B \ S syntactically and A ~ B \ S syntactically (same fact, twice), use transitivity of \ S to conclude B ~ B \ S syntactically.

Does syntactic \ S admit transitivity? This is a key question. Syntactic subtyping with avoidance: if X <: Y \ S and Y <: Z \ S are syntactically derivable, is X <: Z \ S syntactically derivable?

For standard subtyping (no avoidance), transitivity is typically a derived property (not an explicit rule), proven by induction on derivations. For avoidance subtyping, it should still hold but needs to be checked: the “no-go” rules for S-variables shouldn’t break transitivity, since they only restrict the reflexivity rule.

Let me assume transitivity holds syntactically for \ S (this is reasonable and you’d need it anyway for any sensible subtyping system). Then:

Lemma 1, syntactic version: From A ~ B \ S syntactic, derive B ~ A \ S syntactic (by symmetry — ~ is mutual <:), then by transitivity B ~ B \ S syntactic.

Wait, that’s just B ~ B \ S. That’s a fixed-point statement, not a substitution-invariance statement.

I think I was conflating things. Let me redo.

Re-examining what Lemma 1 actually says

The semantic Lemma 1 said: “B’s equivalence class is invariant under varying X’s substituent.”

In the syntactic formulation, the analog would be: “B[X→{X:F}] ~ B[X→{X:F'}] \ S syntactically, for all F, F'.”

But this isn’t a single statement; it’s a family of statements parameterized by F, F'. Each instance is a syntactic claim, but the universal quantification “for all F, F'” is meta-level, not internalized in the syntactic system.

So when we say “Lemma 1 syntactically,” we mean: for any choice of F, F', we can derive B[X→{X:F}] ~ B[X→{X:F'}] \ S syntactically.

How do we get this? From A[X→{X:E}] ~ B[X→{X:F}] \ S syntactically (the IH), for any E, F:

Take E := E* fixed, F := F: gives A[X→{X:E*}] ~ B[X→{X:F}] \ S.
Take E := E* fixed, F := F': gives A[X→{X:E*}] ~ B[X→{X:F'}] \ S.
By symmetry of ~: B[X→{X:F}] ~ A[X→{X:E*}] \ S and A[X→{X:E*}] ~ B[X→{X:F'}] \ S.
By transitivity: B[X→{X:F}] ~ B[X→{X:F'}] \ S.
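
The four steps above amount to a single use of transitivity through the fixed instance A[X→{X:E*}]. As a LaTeX sketch (assuming, as discussed, that ~ is symmetric and that transitivity is admissible for the avoidance system; the rule name Trans is just a label):

```latex
\[
\dfrac{
  B[X \mapsto \{X : F\}] \sim A[X \mapsto \{X : E^*\}] \setminus S
  \qquad
  A[X \mapsto \{X : E^*\}] \sim B[X \mapsto \{X : F'\}] \setminus S
}{
  B[X \mapsto \{X : F\}] \sim B[X \mapsto \{X : F'\}] \setminus S
}\;\textsc{Trans}
\]
```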

So we need IH to give us a family of syntactic claims (one per E, F), and we use them with concrete instantiations.

But wait — what does the IH A ~ B \ S ∪ {X} syntactically actually mean? If \ S ∪ {X} is the syntactic avoidance system, then the IH gives us one syntactic claim, not a family. The family interpretation is what soundness gives via the semantic spec.

Hmm. So the syntactic claim A ~ B \ S ∪ {X} doesn’t directly give us “for all F, F', A[X→{X:E}] ~ B[X→{X:F}] \ S syntactically.” It gives us a single derivation in the avoidance system.

To get the family, we’d need a substitution lemma for the syntactic avoidance system:

Syntactic Substitution Lemma: If A ~ B \ S ∪ {X} syntactically, then for any E, F, A[X→{X:E}] ~ B[X→{X:F}] \ S syntactically.

This is a soundness-style theorem: the syntactic avoidance system, when we strip off avoidance for X, behaves correctly under substitution.

If this substitution lemma holds, then we can chain:

IH gives A₁ ~ B \ S ∪ {X} syntactic.
Substitution lemma gives, for any E, F, A₁[X→{X:E}] ~ B[X→{X:F}] \ S syntactic.
By transitivity (chaining through A₁ with two instantiations), B[X→{X:F}] ~ B[X→{X:F'}] \ S syntactic for any F, F'.

Great — this is Lemma 1 syntactically.

Does the syntactic substitution lemma hold?

Is it the case that A ~ B \ S ∪ {X} syntactic implies A[X→{X:E}] ~ B[X→{X:F}] \ S syntactic, for any E, F?

This is exactly the soundness of the syntactic avoidance system w.r.t. its semantic spec, in one direction. The semantic spec is “for all E, F, the substituted comparison holds.” If syntactic implies semantic, then yes.

Soundness should be provable by induction on the syntactic derivation: each rule of \ S ∪ {X} should preserve the property under substitution. The Top-absorption rule: T <: Top \ S ∪ {X} syntactically gives T[X→{X:E}] <: Top \ S for any E, which is clearly true (anything is <: Top). The intersection elimination rule: similar. The label rule and the variable rule (reflexivity on X is blocked because X ∈ S ∪ {X}, so it never fires) should all work.

So soundness should hold, giving us the syntactic substitution lemma.
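
A finite spot-check of this substitution lemma, as a Python sketch. It assumes the non-distributive algorithmic rules plus one avoidance rule (variable reflexivity is blocked for names in the avoidance set); the encoding and the names `sub`, `equiv`, `subst` are hypothetical, and checking a few instances is of course not a proof of soundness.

```python
# Assumed encoding: ("top",), ("var", n), ("arrow", d, c), ("and", l, r), ("label", n, body).
TOP = ("top",)

def var(name): return ("var", name)
def arrow(dom, cod): return ("arrow", dom, cod)
def inter(left, right): return ("and", left, right)
def label(name, body): return ("label", name, body)

def sub(a, b, avoid=frozenset()):
    """Decide a <: b avoiding `avoid`: no reflexivity on variables in `avoid`."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "and":
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] == "arrow" and b[0] == "arrow":
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "label" and b[0] == "label":
        return a[1] == b[1] and sub(a[2], b[2], avoid)
    if a[0] == "var" and b[0] == "var":
        return a[1] == b[1] and a[1] not in avoid   # the avoidance restriction
    return False

def equiv(a, b, avoid=frozenset()):
    return sub(a, b, avoid) and sub(b, a, avoid)

def subst(t, x, r):
    """Replace every occurrence of the variable x in t by r."""
    tag = t[0]
    if tag == "var":
        return r if t[1] == x else t
    if tag in ("arrow", "and"):
        return (tag, subst(t[1], x, r), subst(t[2], x, r))
    if tag == "label":
        return ("label", t[1], subst(t[2], x, r))
    return t  # Top

A = inter(arrow(TOP, TOP), arrow(var("X"), TOP))   # (Top -> Top) & (X -> Top)
B = arrow(TOP, TOP)                                # Top -> Top
avoid_x = frozenset({"X"})
candidates = [TOP, var("nat"), arrow(TOP, TOP)]

# The abstract judgement A ~ B \ {X} ...
abstract_ok = equiv(A, B, avoid_x)
# ... and every substituted instance A[X -> {X:E}] ~ B[X -> {X:F}] with nothing avoided.
instances_ok = all(
    equiv(subst(A, "X", label("X", e)), subst(B, "X", label("X", f)))
    for e in candidates for f in candidates
)
# The converse direction fails: an instance can hold while the abstract judgement does not.
instance_only = sub(label("X", TOP), label("X", TOP))   # X <: X with X -> {X:Top} on both sides
abstract_fails = sub(var("X"), var("X"), avoid_x)       # X <: X \ {X}

print(abstract_ok, instances_ok, instance_only, abstract_fails)
```

The last two values illustrate why only one direction is expected: a specific instance such as {X:Top} <: {X:Top} is derivable, yet X <: X is blocked once X is in the avoidance set.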

Now, the full syntactic proof attempt

Let me redo the stuck case purely syntactically, with everything in hand.

Premise (syntactic, specific C, D):

(P1) A₁[X→{X:C}] <: B[X→{X:D}] \ S syntactic.
(P2) B[X→{X:D}] <: A₁[X→{X:C}] \ S syntactic.
(P3) B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic.

(I’m in the case where the <: direction of the equivalence used branch 1.)

IH on (A₁, B):

Either C ~ D, or A₁ ~ B \ S ∪ {X} syntactic.

Take the second case. We want to conclude (A₁ & A₂) ~ B \ S ∪ {X} syntactic.

<: direction: A₁ & A₂ <: B \ S ∪ {X} syntactic. Use intersection elimination on the left (pick A₁), reduce to A₁ <: B \ S ∪ {X} — which we have.

:> direction: B <: A₁ & A₂ \ S ∪ {X} syntactic. Use intersection introduction on the right, reduce to B <: A₁ \ S ∪ {X} (have) and B <: A₂ \ S ∪ {X} (need).

To derive B <: A₂ \ S ∪ {X} syntactically, we need a syntactic derivation. What we have:

(P3) B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic — a derivation about specific substituted instances.
IH gives A₁ ~ B \ S ∪ {X} syntactic — a derivation about A₁ and B with X in the avoidance set.
Lemma 1 syntactic (if we can derive it from IH + substitution lemma + transitivity): B[X→{X:F}] ~ B[X→{X:F'}] \ S syntactic for any F, F'.

But what we need is B <: A₂ \ S ∪ {X} — a derivation in the avoidance system about B and A₂, with X in the avoidance set, without any concrete substitution applied.

Can we construct this derivation?

We have a syntactic derivation tree for (P3) involving B[X→{X:D}] and A₂[X→{X:C}]. We want to “abstract” it back into a derivation about B and A₂ in the avoidance system.

This abstraction is not in general going to work, because the derivation (P3) might use facts about D and C specifically (e.g., it might reduce {X:D} <: {X:C} via the label rule, requiring D <: C, which is a fact about D, C, not about B, A₂).

So we cannot blindly abstract. We need to know that the derivation (P3) can be transformed into one that doesn’t depend on D, C — specifically, that the X -occurrences in (P3) appear only in absorbed positions in the derivation, so they can be replaced by avoidance.

This is exactly the absorption argument again, just now in a syntactic-to-syntactic transformation rather than a syntactic-to-semantic lifting.
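
The smallest instance of this obstruction takes B = A₂ = X. The LaTeX sketch below assumes the avoidance system has a covariant label rule and a variable-reflexivity rule that is blocked for avoided variables (the rule names are just labels):

```latex
\[
\frac{E <: F \setminus S}{\{X : E\} <: \{X : F\} \setminus S}\;(\textsc{Label})
\qquad
\frac{Y \notin S}{Y <: Y \setminus S}\;(\textsc{Var})
\]
% With B = A_2 = X, the substituted instance is derivable whenever D <: C,
% but the abstract judgement is not, because Var is blocked for X.
\[
\frac{D <: C \setminus S}{\{X : D\} <: \{X : C\} \setminus S}\;(\textsc{Label})
\qquad\text{but}\qquad
X <: X \setminus (S \cup \{X\}) \ \text{ is not derivable, since } X \in S \cup \{X\}.
\]
```

So a derivation of (P3) that passes through the label rule at an X-position carries information about D and C that has no counterpart in the avoidance system.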

So even purely syntactically, we need the absorption lemma. The work hasn’t gone away — it’s just been moved.

What syntactic conclusion buys us, and what it doesn’t

What syntactic conclusion (instead of semantic) buys us:

✓ The IH gives a single derivation, easier to manipulate than a universal property.
✓ Case-analysis on derivations is available throughout.
✓ Transitivity and other syntactic properties chain cleanly.
✓ Lemma 1 (substitution-invariance) is derivable from IH + soundness substitution lemma + transitivity.

What it doesn’t buy:

✗ The absorption argument is still needed to lift (P3) into an avoidance-system derivation about B and A₂.
✗ The proof of “absorption” still requires looking at B’s structure (or the derivation involving B) and arguing that B’s X-occurrences are inactive.

So is the syntactic statement an improvement?

Yes, modestly, in two ways:

Simpler bookkeeping. Working with derivations end-to-end (no semantic interlude) is more uniform.
Lemma 1 becomes a clean syntactic chain using the substitution lemma + transitivity, rather than a transitivity-on-semantic-equivalences argument.

But:

The core obstacle, lifting (P3), remains. It requires either a syntactic substitution-invariance-of-B argument that propagates through the derivation, or an explicit absorption analysis.

A cleaner phrasing of what’s needed

Stripped of unnecessary semantic detour, the heart of the proof is:

Lemma 2’ (purely syntactic). Suppose:

B ~ B' \ S ∪ {X} syntactic, where B' = B (i.e., B ‘s avoidance-equivalence to itself with X avoided — this captures syntactic substitution-invariance of B w.r.t. X).

Wait, this doesn’t quite parse — B ~ B \ S ∪ {X} is reflexivity-modulo-avoidance, which holds iff B ‘s free variables are disjoint from S ∪ {X} (i.e., X doesn’t appear in B). But X does appear in B in the cases we care about — that’s the whole point.

Hmm. Let me reformulate.

The right syntactic version of “B’s equivalence class is substitution-invariant in X” is:

B ~ B \ S ∪ {X} syntactic, interpreted as “the avoidance-aware system can derive B’s self-equivalence even with X avoided,” which permits X to occur in B so long as it occurs in absorbed positions.

So B ~ B \ S ∪ {X} syntactic is the right formal statement: it says the avoidance-aware system can prove B’s equivalence with itself, despite X appearing in B, because X only appears in absorbed positions. This is exactly the syntactic-absorption property.

And the way we obtain B ~ B \ S ∪ {X} syntactic in the proof is:

IH gives A₁ ~ B \ S ∪ {X} syntactic.
Symmetry: B ~ A₁ \ S ∪ {X} syntactic.
Transitivity: B ~ B \ S ∪ {X} syntactic.

✓ That’s clean.

Now Lemma 2’ becomes:

Lemma 2’ (syntactic). Suppose:

B ~ B \ S ∪ {X} syntactic. (Syntactic substitution-invariance of B w.r.t. X.)

B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic for some specific C, D.

Then B <: A₂ \ S ∪ {X} syntactic.

This is the core lemma whose proof requires the structural absorption argument.

My answer to your question

Yes, making the conclusion syntactic is a simplification. It cleans up Lemma 1, removes the soundness detour, and unifies the proof in one system. Specifically:

Lemma 1 (substitution-invariance) becomes derivable from IH + transitivity + symmetry, all syntactic.
The output of Lemma 1 is a clean syntactic statement: B ~ B \ S ∪ {X} syntactic.
The remaining work is Lemma 2’ (lifting (P3) into the avoidance system), which is genuinely needed regardless of semantic vs. syntactic conclusion.

However, the hard step (Lemma 2’) is still hard. It’s the same absorption argument, just expressed syntactically. The proof of Lemma 2’ is by induction on the derivation of (P3), using B ~ B \ S ∪ {X} syntactic as a structural hypothesis to constrain the derivation.

So switching to a syntactic conclusion is a real simplification but doesn’t avoid the core difficulty. The core difficulty is genuinely about how X-occurrences in A₂ interact with the derivation, and no choice of conclusion-flavor (syntactic vs semantic) makes that go away.

Concrete suggestion

Try the syntactic version first — the bookkeeping is cleaner. Specifically:

Define \ S syntactically with the rules you have in mind (Top-absorption, intersection elimination, blocked variable reflexivity for S, etc.).
Prove that the syntactic system has the structural properties: transitivity, symmetry, soundness w.r.t. semantic spec, substitution lemma.
State the inversion lemma with both premise and conclusion syntactic.
Prove Lemma 1 (B ~ B \ S ∪ {X} syntactic from IH + transitivity).
Prove Lemma 2’ by induction on the derivation of (P3), using Lemma 1’s conclusion as a structural hypothesis. This is where the absorption analysis happens and where the case-analysis-on-derivations buys you the most.

If Lemma 2’ goes through, the whole proof closes. If it doesn’t go through, you’ll have a precise stuck case that tells you what additional rule the avoidance system needs — and that stuck case will be tractable to analyze, because everything is syntactic.

This is the route I’d recommend pursuing now. Want to try sketching Lemma 2’ more carefully and seeing where it gets stuck?

You

can you try to sketch lemma 2’? Here is my A <: B / S rule for your reference:

Inductive ssub_disjoint : typ → typ → atoms → Prop :=
| ssub_d_nat : forall D,
    ssub_disjoint typ_nat typ_nat D
| ssub_d_top : forall A D,
    lc_typ A →
    ssub_disjoint A typ_top D
| ssub_d_var : forall X D,
    X notin D →
    ssub_disjoint (typ_fvar X) (typ_fvar X) D
| ssub_d_arrow : forall A1 A2 B1 B2 D,
    ssub_disjoint B1 A1 D →
    ssub_disjoint A2 B2 D →
    ssub_disjoint (typ_arrow A1 A2) (typ_arrow B1 B2) D
| ssub_d_and : forall A B1 B2 D,
    ssub_disjoint A B1 D →
    ssub_disjoint A B2 D →
    ssub_disjoint A (typ_and B1 B2) D
| ssub_d_andL : forall A1 A2 B D,
    ssub_disjoint A1 B D →
    lc_typ A2 →
    ord B →
    ssub_disjoint (typ_and A1 A2) B D
| ssub_d_andR : forall A1 A2 B D,
    ssub_disjoint A2 B D →
    lc_typ A1 →
    ord B →
    ssub_disjoint (typ_and A1 A2) B D
| ssub_d_rcd : forall A B l D,
    ssub_disjoint A B D →
    ssub_disjoint (typ_rcd l A) (typ_rcd l B) D

Claude

Sketching Lemma 2’

Thanks for the rules — they’re very helpful. Let me first orient myself to the system, then sketch Lemma 2’.

Orienting to the rules

A few observations about your ssub_disjoint:

No reflexivity rule for variables in D. ssub_d_var requires X ∉ D. Good — this matches the avoidance semantics.
Top absorbs everything via ssub_d_top, with no recursion into the LHS. So T <: Top \ D for any T regardless of free variables in T. ✓
Intersection elimination on the left (ssub_d_andL, ssub_d_andR) requires the RHS to be ordinary. ✓ This matches your earlier description of “ordinary” types.
Intersection introduction on the right (ssub_d_and) requires both branches to be subtyped.
Records (labels) typ_rcd l A are compared by their bodies, with the label name l being a syntactic atom (not in D). The label name l is just structural — it must match exactly between LHS and RHS.

One thing to note: I don’t see a transitivity rule explicitly. I’ll assume it’s a derived property (provable as a meta-lemma).

A second thing: ssub_d_top requires lc_typ A (locally closed) but no constraint that A’s free variables avoid D. So Top-absorption truly absorbs everything. ✓

A third thing: there’s no explicit “subsumption” rule combining <: <: into a single <:. Subtyping is structural here, and transitivity is an admissibility property.
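To make these observations concrete, here is a minimal Python sketch of a checker that mirrors the eight constructors of ssub_disjoint. It is only an illustration, not your Coq development: the tuple encoding and the name sub are made up for this example, lc_typ is ignored (every term is treated as locally closed), and ord is assumed to mean "not an intersection at the top level", which may or may not match your definition.

```python
# Hypothetical tuple encoding of types (not the Coq datatype).
NAT, TOP = ("nat",), ("top",)
def var(x): return ("var", x)
def arrow(a, b): return ("arrow", a, b)
def and_(a, b): return ("and", a, b)
def rcd(l, a): return ("rcd", l, a)

def sub(a, b, avoid):
    """Decide a <: b while avoiding the variables in `avoid`.

    Assumes ord(b) means: b is not an intersection.
    """
    if b[0] == "top":                       # ssub_d_top: no look at a
        return True
    if b[0] == "and":                       # ssub_d_and: split the RHS
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "and":                       # ssub_d_andL / ssub_d_andR
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] != b[0]:
        return False
    if a[0] == "nat":                       # ssub_d_nat
        return True
    if a[0] == "var":                       # ssub_d_var: blocked if avoided
        return a[1] == b[1] and a[1] not in avoid
    if a[0] == "arrow":                     # ssub_d_arrow: contra / co
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "rcd":                       # ssub_d_rcd: same label, bodies
        return a[1] == b[1] and sub(a[2], b[2], avoid)
    return False

X = var("X")
# The example from the start of the discussion:
A = and_(arrow(TOP, TOP), arrow(X, TOP))
B = arrow(TOP, TOP)

print(sub(X, X, set()))       # True: X is not avoided
print(sub(X, X, {"X"}))       # False: reflexivity on X is blocked
print(sub(X, TOP, {"X"}))     # True: Top absorbs X
print(sub(A, B, {"X"}), sub(B, A, {"X"}))  # True True: X is never compared
```

The last line is the path-avoidance example: both directions go through with X avoided, because X is either dropped with its intersection branch or absorbed by Top.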

Setup for Lemma 2’

In the inversion lemma’s stuck case, I had:

B ~ B \ S ∪ {X} syntactic (from IH + transitivity).
B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic (from premise).
Want: B <: A₂ \ S ∪ {X} syntactic.

But before sketching the proof, I want to flag a question about the Lemma 2’ formulation itself. Let me think through it carefully.

A subtle point about substitution and X

The premise B[X→{X:D}] <: A₂[X→{X:C}] \ S involves substituting X (as a variable) with {X:D} (a record/label). The substitution operation here is on type variables — X as typ_fvar X becomes typ_rcd X D.

The conclusion B <: A₂ \ S ∪ {X} involves the unsubstituted B and A₂, where X may appear as typ_fvar X.

In the avoidance system, X as a free variable in B or A₂ can appear:

In a Top-position (absorbed via ssub_d_top).
In a dropped intersection branch (via ssub_d_andL or ssub_d_andR).
On either side of the comparison, but never via ssub_d_var (because X ∈ D = S ∪ {X}).

The premise (with X substituted to {X:D}) treats X-occurrences as records, which match via ssub_d_rcd and recurse into bodies. The bodies D and C (both types) are then compared.

For the conclusion to follow from the premise, we need to “un-substitute” — i.e., transform a derivation involving records {X:D}, {X:C} into a derivation involving variables X in absorbed positions.

This won’t always work. If the premise’s derivation uses ssub_d_rcd to compare {X:D} and {X:C} actively (recursing into D <: C), this comparison is fine in the substituted system but doesn’t lift to the avoidance system (where X as a variable can’t be reflexively compared with X ∈ D).

So Lemma 2’ is conditional on D, C not being actively compared in the derivation — i.e., on the absorption property.

Lemma 2’ sketch

Lemma 2’. Suppose:

B ~ B \ S ∪ {X} syntactic, i.e., B <: B \ S ∪ {X} (both directions of the equivalence coincide here; call it the self-equivalence property w.r.t. X).

B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic.

Then B <: A₂ \ S ∪ {X} syntactic.

Proof sketch by simultaneous induction on the derivations of (1) and (2), primarily on (2).

Wait, let me rethink. Hypothesis (1) is B <: B \ S ∪ {X}. This is a derivation showing B <: B even though X may appear in B and X is in the avoidance set. The derivation cannot use ssub_d_var for X (forbidden); it must use ssub_d_top or intersection elimination at every position where X would otherwise be reflexively compared.

So (1) tells us, structurally, that B is “absorbed-compatible w.r.t. X”: every X-occurrence in B is in a position the avoidance system can handle without using X-reflexivity.

Now I want to use this to transform (2) into a derivation of B <: A₂ \ S ∪ {X}.

Approach: induct on the derivation of (2), case-analyzing the last rule.

For each rule, I’ll consider how to lift it.

Case ssub_d_nat: B[X→{X:D}] = nat, A₂[X→{X:C}] = nat. Then B = nat (no X), A₂ = nat (no X). Conclusion: nat <: nat \ S ∪ {X} by ssub_d_nat. ✓

Case ssub_d_top: A₂[X→{X:C}] = Top. Then A₂ = Top (substitution doesn’t introduce Top). Conclusion: B <: Top \ S ∪ {X} by ssub_d_top — needs lc_typ B, which we have. ✓

This is a key absorption case. Even if B has many occurrences of X, we hit Top on the right and discharge by ssub_d_top. ✓

Case ssub_d_var: B[X→{X:D}] = typ_fvar Y, A₂[X→{X:C}] = typ_fvar Y, with Y ∉ S.

Here Y is a variable, and substitution preserves variables — so B = typ_fvar Y' and A₂ = typ_fvar Y'' such that Y'[X→{X:D}] = typ_fvar Y and Y''[X→{X:C}] = typ_fvar Y.

Substitution of X for {X:D} in a variable: if the variable is X, becomes {X:D} (a record, not a variable!). If the variable is Y ≠ X, stays as typ_fvar Y.

So if B[X→{X:D}] = typ_fvar Y, then B = typ_fvar Y with Y ≠ X (otherwise B[X→{X:D}] would be {X:D}, not a variable). Similarly A₂ = typ_fvar Y with Y ≠ X.

Conclusion: need typ_fvar Y <: typ_fvar Y \ S ∪ {X} by ssub_d_var. Requires Y ∉ S ∪ {X}. We know Y ∉ S (from premise). And Y ≠ X, so Y ∉ {X}. So Y ∉ S ∪ {X} ✓.

✓ This case lifts cleanly.

Case ssub_d_arrow: B[X→{X:D}] = A_dom -> A_cod, A₂[X→{X:C}] = B_dom -> B_cod.

So B = B_dom_orig -> B_cod_orig and A₂ = A_dom_orig -> A_cod_orig with the components substituted appropriately.

Subgoals:

B_dom <: A_dom \ S (which is A₂_dom_orig[X→{X:C}] <: B_dom_orig[X→{X:D}] \ S; wait, let me re-examine the direction).

Actually let me reparse. Rule: ssub_d_arrow A1 A2 B1 B2 D gives (A1 -> A2) <: (B1 -> B2) \ D from B1 <: A1 \ D (contravariant domain) and A2 <: B2 \ D (covariant codomain).

So in our case: B[X→{X:D}] = (B_pre)[X→{X:D}] = (B_dom_orig)[X→{X:D}] -> (B_cod_orig)[X→{X:D}]. And A₂[X→{X:C}] = (A₂_dom_orig)[X→{X:C}] -> (A₂_cod_orig)[X→{X:C}].

Sub-derivations:

(A₂_dom_orig)[X→{X:C}] <: (B_dom_orig)[X→{X:D}] \ S — but wait, this is the contravariant direction, so it should be A₂_dom <: B_dom \ S (yes, but with substitutions appropriately).

Hmm, but here the LHS has {X:C} substituted (from A₂_dom_orig) and the RHS has {X:D} substituted (from B_dom_orig). The form is not Lhs[X→{X:D}] <: Rhs[X→{X:C}] \ S — the substituents are swapped relative to the outer comparison!

This is a critical observation I’d glossed over. Let me think about it.

In the outer derivation: LHS gets {X:D}, RHS gets {X:C}. This is the “specific premise” of the inversion lemma.

In the contravariant arrow position: the inner LHS becomes A₂_dom_orig[X→{X:C}] (taken from the RHS of the outer comparison, so substituted with C), and the inner RHS becomes B_dom_orig[X→{X:D}] (taken from the LHS of the outer comparison, so substituted with D).

So the substituents swap when crossing a contravariant boundary. The inner subtyping has form <RHS_substituted_with_C> <: <LHS_substituted_with_D> \ S — which is the opposite substitution pattern.

This is concerning. The IH for Lemma 2’ (on a sub-derivation) wants the form B'[X→{X:D}] <: A₂'[X→{X:C}] \ S (matching the outer form). But the contravariant sub-derivation has A₂'_dom[X→{X:C}] <: B'_dom[X→{X:D}] \ S — substituents swapped.

The IH doesn’t directly apply to the swapped form. We’d need a symmetric Lemma 2’ or to handle both substitution patterns.
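The swap can be observed on a tiny instance. The sketch below is a hedged illustration only: the tuple encoding and the helper names sub and subst are invented here, lc_typ is ignored, and ord is assumed to mean "not an intersection". With D = nat on the B side and C = Top on the A₂ side, a covariant occurrence of X asks for D <: C, while an occurrence in an arrow domain asks for C <: D.

```python
NAT, TOP = ("nat",), ("top",)
def var(x): return ("var", x)
def arrow(a, b): return ("arrow", a, b)

def sub(a, b, avoid):
    """a <: b avoiding `avoid` (ord assumed to mean: not an intersection)."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "and":
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] != b[0]:
        return False
    if a[0] == "nat":
        return True
    if a[0] == "var":
        return a[1] == b[1] and a[1] not in avoid
    if a[0] == "arrow":
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "rcd":
        return a[1] == b[1] and sub(a[2], b[2], avoid)
    return False

def subst(t, x, body):
    """t[x -> {x : body}]: replace each free variable x by a record."""
    if t[0] == "var":
        return ("rcd", x, body) if t[1] == x else t
    if t[0] in ("arrow", "and"):
        return (t[0], subst(t[1], x, body), subst(t[2], x, body))
    if t[0] == "rcd":
        return ("rcd", t[1], subst(t[2], x, body))
    return t

X, S = var("X"), set()
D, C = NAT, TOP            # D goes into B, C goes into A2

# Covariant occurrence: B = A2 = X, so the check is {X:D} <: {X:C}.
covariant = sub(subst(X, "X", D), subst(X, "X", C), S)

# Contravariant occurrence: B = A2 = X -> nat, so the domain check
# is {X:C} <: {X:D}, i.e. the substituents appear swapped.
F = arrow(X, NAT)
contravariant = sub(subst(F, "X", D), subst(F, "X", C), S)
swapped = sub(subst(F, "X", C), subst(F, "X", D), S)

print(covariant, contravariant, swapped)   # True False True
```

The same pair (C, D) that works covariantly fails under the arrow domain, and only the swapped pair succeeds there, which is why an IH fixed to one substitution pattern does not apply to the contravariant sub-derivation.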

Reformulating Lemma 2’ to handle both directions

I think Lemma 2’ needs to be symmetric in (C, D) or be stated in a more direction-neutral way. Let me try:

Lemma 2’ (revised). Suppose:

B ~ B \ S ∪ {X} syntactic.

A₂ ~ A₂ \ S ∪ {X} syntactic. (New: also assume A₂ is absorption-compatible.)

B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic for some specific C, D.

Then B <: A₂ \ S ∪ {X} syntactic.

But hypothesis (2) is a strong addition. We don’t have A₂ ~ A₂ \ S ∪ {X} from the inversion lemma’s setup — only B ~ B \ S ∪ {X} from the IH.

Hmm. So this revision doesn’t help directly. Let me think differently.

Alternative: Maybe Lemma 2’ should generalize to assert something for both LHS and RHS subtypes:

Lemma 2” (revised again). Suppose:

B ~ B \ S ∪ {X} syntactic.

B[X→{X:D}] <: T[X→{X:C}] \ S syntactic.

Then for any E, F in some class of “safe” types: B[X→{X:F}] <: T[X→{X:E}] \ S syntactic.

This is more like the original semantic spec. But proving it has the same case-analysis issue.

Stepping back: the swap problem is real

The contravariance swap is a real obstacle that I hadn’t fully appreciated. Let me think about what it means.

In the semantic spec, both E and F quantify universally and independently, so swapping is fine — universal ∀E, F. ... is symmetric in (E, F). But in the syntactic system with specific (C, D), the swap takes us to a different specific instance (D, C), which is not the same derivation.

However, hypothesis (1), B’s syntactic self-equivalence, is symmetric: B ~ B \ S ∪ {X} doesn’t have a directional bias. So whatever absorption argument applies on the LHS side (in <:) also applies on the RHS side (in :> of the same equivalence).

I think the right move is:

Lemma 2’ (final attempt). Suppose:

B ~ B \ S ∪ {X} syntactic.

B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic.

A₂[X→{X:C}] <: A₂[X→{X:D}] \ S syntactic, OR equivalently for some structural reason A₂’s X-occurrences match what’s possible.

Then B <: A₂ \ S ∪ {X} syntactic.

This is getting muddled. Let me back off and approach the problem differently.

A different angle: induct on (1), not (2)

Hypothesis (1) B ~ B \ S ∪ {X} is structurally about B. If we induct on the derivation of (1), we walk through B’s structure, knowing at each position how X-occurrences in B are handled by the avoidance system.

But I’m not sure how to use this to construct a derivation of B <: A₂ \ S ∪ {X} — because that derivation walks through both B and A₂, not just B.

Maybe induct on A₂’s structure instead?

Case A₂ = Top. Conclusion: B <: Top \ S ∪ {X} by ssub_d_top. ✓

Case A₂ = nat. Then A₂[X→{X:C}] = nat. From hypothesis (2): B[X→{X:D}] <: nat \ S syntactic. By inversion, either B[X→{X:D}] = nat (via ssub_d_nat), or B[X→{X:D}] is an intersection with one branch below nat (via ssub_d_andL or ssub_d_andR). In the simple case B[X→{X:D}] = nat, so B = nat. Conclusion: nat <: nat \ S ∪ {X} by ssub_d_nat. ✓

Case A₂ = typ_fvar Y with Y ≠ X. Then A₂[X→{X:C}] = Y. From hypothesis (2): B[X→{X:D}] <: Y \ S. By inversion, either B[X→{X:D}] = Y via ssub_d_var (ssub_d_top doesn’t apply, since the RHS is not Top), or B[X→{X:D}] is an intersection with one branch below Y. In the simple case B[X→{X:D}] = Y, so B = Y (since Y ≠ X, substitution leaves it unchanged). Conclusion: Y <: Y \ S ∪ {X} by ssub_d_var (need Y ∉ S ∪ {X}, which holds since Y ∉ S from the side condition of ssub_d_var in the premise, and Y ≠ X). ✓

Case A₂ = typ_fvar X. Then A₂[X→{X:C}] = {X:C} (a record). From hypothesis (2): B[X→{X:D}] <: {X:C} \ S. By inversion, B[X→{X:D}] = {X:R} with R <: C \ S, OR B[X→{X:D}] = T₁ & T₂ with one branch below the record. Let me focus on the simple case: B[X→{X:D}] = {X:R}. So either B = X (substitution makes it {X:D}, with R = D), or B = {X:R'} with R'[X→{X:D}] = R. The second option is excluded, because X doesn’t appear free as a label in B, only as a variable.

So if B[X→{X:D}] = {X:R}, the only way is B = X (the variable, which substitutes to {X:D} with R = D). Then we need B <: A₂ \ S ∪ {X} = X <: X \ S ∪ {X} — but X ∈ S ∪ {X}, so ssub_d_var is blocked! This derivation fails.

But hypothesis (1) B ~ B \ S ∪ {X} would also fail in this case (since B = X requires X <: X \ S ∪ {X}, blocked).

So hypothesis (1) rules out this case. Good — it’s consistent.

Case A₂ = A₂_dom -> A₂_cod. Then A₂[X→{X:C}] = (A₂_dom[X→{X:C}]) -> (A₂_cod[X→{X:C}]). From hypothesis (2), B[X→{X:D}] <: (A₂_dom[X→{X:C}]) -> (A₂_cod[X→{X:C}]) \ S.

Inversion: either the RHS is Top (no), or B[X→{X:D}] is an arrow (the simple case) or an intersection where one branch is an arrow.

Simple case: B[X→{X:D}] = B_dom -> B_cod, so B = B_dom_orig -> B_cod_orig. Sub-derivations:

A₂_dom[X→{X:C}] <: B_dom \ S (contravariant)
B_cod <: A₂_cod[X→{X:C}] \ S (covariant)

But B_dom = B_dom_orig[X→{X:D}] and B_cod = B_cod_orig[X→{X:D}].

Contravariant sub-derivation: A₂_dom[X→{X:C}] <: B_dom_orig[X→{X:D}] \ S. This has form LHS[C-substituted] <: RHS[D-substituted] — substituents swapped relative to the outer.

To apply IH (recursing on A₂_dom), I’d need a Lemma 2’ that handles the swapped substituent form.

So I really do need either:

A more general Lemma 2’ that handles both substitution patterns symmetrically, or
A pair of mutually-recursive lemmas (one for each substitution direction).

Let me try the pair-of-lemmas approach.

Mutual recursion: Lemma 2’a and 2’b

Lemma 2’a. Suppose B ~ B \ S ∪ {X} syntactic, A₂ ~ A₂ \ S ∪ {X} syntactic, and B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic. Then B <: A₂ \ S ∪ {X} syntactic.

Lemma 2’b. Suppose B ~ B \ S ∪ {X} syntactic, A₂ ~ A₂ \ S ∪ {X} syntactic, and B[X→{X:C}] <: A₂[X→{X:D}] \ S syntactic. (Substituents swapped.) Then B <: A₂ \ S ∪ {X} syntactic.

The two lemmas mutually recurse: in the arrow case, the contravariant sub-derivation uses the swapped form, calling the other lemma.

But wait, I added the hypothesis A₂ ~ A₂ \ S ∪ {X} syntactic — and we don’t have that from the inversion lemma.

Hmm. In the inversion lemma’s stuck case, we have B’s self-equivalence (from IH on (A₁, B)). We don’t have A₂’s self-equivalence. Adding it as a hypothesis to Lemma 2’ makes the lemma not directly applicable.

Question to think about: Can we derive A₂ ~ A₂ \ S ∪ {X} in the inversion lemma’s stuck case from what we have?

We have:

A₁ ~ B \ S ∪ {X} (IH).
A₁ & A₂ ~ B \ S (premise after substitution).

Hmm. From these, we have A₁ & A₂ <: B \ S syntactic, which by intersection-elim gives A₁ <: B \ S or A₂ <: B \ S syntactic (specific instance with D, C). For the general avoidance system, do we have A₁ & A₂ <: B \ S ∪ {X}? Not directly.

Actually, wait. Going back to the inversion lemma’s setup: after applying IH, we’re trying to establish A₁ & A₂ ~ B \ S ∪ {X} syntactic. We don’t yet have it. So we can’t extract A₂ ~ A₂ \ S ∪ {X} from a not-yet-proven conclusion.

I think we may need yet another reformulation

Let me try once more, being very honest about what we have and need.

Inputs to Lemma 2’ from the inversion lemma’s stuck case:

(P3): B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic — a derivation in the avoidance system about specific C, D.
(B-self): B ~ B \ S ∪ {X} syntactic: B’s self-equivalence in the avoidance system.

We do not have analogous self-equivalence for A₂.

Output we want:

B <: A₂ \ S ∪ {X} syntactic.

The mismatch: the swap problem in the arrow case. In the contravariant position, we’d need A₂_dom <: B_dom \ S ∪ {X} syntactic, which is the “reverse” form. To get this from a sub-derivation of (P3), we’d want a “swapped Lemma 2’”, but as noted, the swapped lemma seems to need A₂’s self-equivalence, which we don’t have.

Wait. Let me think about what the contravariant sub-derivation actually says. It says A₂_dom[X→{X:C}] <: B_dom[X→{X:D}] \ S. We want to lift this to A₂_dom <: B_dom \ S ∪ {X}.

By the same argument as the outer proof, but with A₂_dom in the LHS role and B_dom in the RHS role, we’d need (for the recursive application):

A₂_dom’s self-equivalence (A₂_dom ~ A₂_dom \ S ∪ {X}).
The sub-derivation.

We have the sub-derivation but not A₂_dom’s self-equivalence.

Hmm. So the inductive structure requires self-equivalence of both sides at each level.

What if both sides’ self-equivalence is implied by the global setup?

Going back to the inversion lemma: from the premise A ~ B \ S syntactic for specific (C, D), we have A[X→{X:C}] ~ B[X→{X:D}] \ S syntactic — meaning both A[X→{X:C}] <: B[X→{X:D}] \ S and B[X→{X:D}] <: A[X→{X:C}] \ S.

Question: does this imply A ~ A \ S ∪ {X} syntactic and B ~ B \ S ∪ {X} syntactic, prior to any IH?

From A ~ B \ S for specific (C, D): by transitivity, A[X→{X:C}] ~ A[X→{X:C}] \ S syntactic. But this is reflexivity for the substituted A, not avoidance-self-equivalence.

I don’t think we get free self-equivalence in the avoidance system without doing some work.

Concluding the sketch with honesty

I want to stop and acknowledge what I’ve found:

The Lemma 2’ approach has a real difficulty I hadn’t fully anticipated: the contravariant swap. When recursing into arrow domains, the substituent pattern swaps, and the recursive call’s hypotheses don’t match what’s available.

Possible resolutions:

Generalize Lemma 2’ to handle both substitution patterns simultaneously, with hypotheses that cover both B’s and A₂’s structural absorption. This requires deriving A₂’s self-equivalence somehow, perhaps by strengthening the inversion lemma’s IH to give us self-equivalence of both A and B on the conclusion side. Specifically, if we conclude A ~ B \ S ∪ {X} syntactic, then by transitivity we get A ~ A \ S ∪ {X} and B ~ B \ S ∪ {X}. So strengthening to deliver self-equivalence is automatic, but we need it during the proof, not just at the conclusion.
Reformulate the avoidance system to make this kind of substitution-sensitive lifting easier. Maybe with a more refined notion that tracks substitution in derivations directly.
Use a different proof technique — perhaps not induction on size but on some other measure, or a Kripke-style logical relation.

I think option 1 is most promising, but it requires careful inductive structuring. Specifically: in the inversion lemma’s stuck case, before applying IH on (A₁, B), we’d want self-equivalence of A₁, A₂, and B. Self-equivalence of B might be provable from the premise directly (not just the IH), because the premise gives us B[X→{X:D}] <: A₁[X→{X:C}] <: ... <: B[X→{X:D}] cycles that constrain B.

But this is getting into uncharted territory. Let me stop here and ask:

Questions back to you

Does your subtyping system have transitivity as an admissible rule? This affects whether Lemma 1 (substitution-invariance via transitivity) can be done syntactically without difficulty.
In your paper, do you have a substitution lemma for \ S already proven — i.e., that A <: B \ S ∪ {X} syntactic implies A[X→{X:E}] <: B[X→{X:F}] \ S syntactic? This is the soundness of avoidance under substitution.
Would you be open to strengthening the inversion lemma’s conclusion to include self-equivalence of A and B? E.g., conclude not just A ~ B \ S ∪ {X} but also A ~ A \ S ∪ {X} and B ~ B \ S ∪ {X}. This might give the IH enough to handle the contravariant swap in Lemma 2’.
Have you encountered similar “swap” issues elsewhere in your work on this system? If so, what techniques did you use to resolve them?

I’m sorry I couldn’t deliver a clean proof sketch — the swap issue is genuinely thorny and I don’t want to paper over it. Let me know your thoughts and we can refine further.

You

For the proof sketch, can we instead prove this?

Lemma 2’ (syntactic). Suppose:

B ~ B \ S ∪ {X} syntactic. (Syntactic substitution-invariance of B w.r.t. X.)
B[X] <: A₂[X] \ S syntactic.

Then B <: A₂ \ S ∪ {X} syntactic.

Premise 2 follows from the original one by dropping {X: C} or {X: D} to X, which is safe because the record subtyping rule also checks that the label X is identical when comparing {X: C} <: {X: D}, and C and D do not provide much information for this lemma. This should avoid the swapping problem.

And you are right, before getting into this problem, that although it is possible that B contains free X, we can tell in this case from the first premise that X is always avoided (even in the identity check of B), so its occurrences never interact actively with any type. I’m not sure how to do it exactly, but the intuition is that this seems to be the property we can exploit for proving this lemma.

Claude

Reformulating Lemma 2’ Without the Swap Problem

This is a clever move. Let me think through it carefully.

Understanding your reformulation

You’re proposing to “drop” the substitution from {X:C} and {X:D} to just X (the variable), so the premise becomes a comparison of B and A₂ with X appearing as a variable on both sides — symmetrically.

Lemma 2’ (revised). Suppose:

B ~ B \ S ∪ {X} syntactic.

B[X] <: A₂[X] \ S syntactic.

Then B <: A₂ \ S ∪ {X} syntactic.

Here B[X] and A₂[X] mean B and A₂ “with X left as a variable” — but since B and A₂ already may contain X as a variable in your setup, this is just B and A₂ themselves. The “[X]” notation is a bit confusing; let me re-read.

Looking again: in the original premise, we had B[X→{X:D}] <: A₂[X→{X:C}] \ S, and you’re saying “drop {X:C} and {X:D} to X”, meaning the substituents become just the variable X on both sides. This gives B[X→X] <: A₂[X→X] \ S, i.e., B <: A₂ \ S.

So the premise becomes: B <: A₂ \ S syntactic (with S as the avoidance set, no X substitution at all, just the original B and A₂ with X possibly free).

That’s a cleaner formulation. Let me re-state:

Lemma 2’ (revised). Suppose:

B ~ B \ S ∪ {X} syntactic.

B <: A₂ \ S syntactic. (Note: avoidance set is just S, not S ∪ {X}.)

Then B <: A₂ \ S ∪ {X} syntactic.

The conclusion strengthens the avoidance set from S to S ∪ {X}.

Does premise 2 follow from the original?

You claim: from B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic, we can derive B <: A₂ \ S syntactic.

Why? Your reasoning: in the record subtyping rule ssub_d_rcd, the label name must match (both sides have label X). The bodies C and D are compared, but the lemma’s conclusion doesn’t depend on the bodies’ specific values — so we can “abstract” them away by treating X as a variable instead of as a record label.

Let me check this transformation more carefully.

The substitution X → {X:D} replaces every occurrence of X (as typ_fvar X) with typ_rcd X D. The “inverse” replaces every typ_rcd X _ with typ_fvar X, but only those typ_rcd X _ that came from substitution, not “intrinsic” record types. Since you said X doesn’t appear free as a label in B or A₂, all typ_rcd X _ in B[X→{X:D}] and A₂[X→{X:C}] came from substitution, so the inverse is well-defined.

Apply this inverse to both sides of the derivation B[X→{X:D}] <: A₂[X→{X:C}] \ S:

LHS: B[X→{X:D}] becomes B[X→{X:D}][{X:_}→X] = B. ✓
RHS: A₂[X→{X:C}] becomes A₂. ✓
The avoidance set S is unchanged.

But does the derivation lift? The rules in your ssub_disjoint:

ssub_d_rcd: derives {X:R} <: {X:R'} \ S from R <: R' \ S. After inverse, this becomes… X <: X \ S (the variables on both sides are X). For this to be derivable in \ S, we need ssub_d_var with X ∉ S. Provided X ∉ S (which is the standing assumption — X is the new variable, not yet in S), this works.
ssub_d_top: T <: Top \ S. After inverse, still T' <: Top \ S where T' is T with records-from-substitution un-replaced. Still fine.
ssub_d_var: Y <: Y \ S with Y ∉ S. Substitution doesn’t affect non-X variables, so the inverse is identity. Fine.
Other structural rules: induct.

So yes, the derivation lifts: B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic implies B <: A₂ \ S syntactic, provided X ∉ S (which is the operative case).
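The inverse-substitution step can be tried out on one example. This is a sketch under stated assumptions, not part of your development: the tuple encoding and the names sub, subst, and unsubst are hypothetical, lc_typ is ignored, ord is taken to mean "not an intersection", and X is assumed not to occur as a label in B, A₂, C, or D.

```python
NAT, TOP = ("nat",), ("top",)
def var(x): return ("var", x)
def arrow(a, b): return ("arrow", a, b)
def and_(a, b): return ("and", a, b)

def sub(a, b, avoid):
    """a <: b avoiding `avoid` (ord assumed to mean: not an intersection)."""
    if b[0] == "top":
        return True
    if b[0] == "and":
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if a[0] == "and":
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if a[0] != b[0]:
        return False
    if a[0] == "nat":
        return True
    if a[0] == "var":
        return a[1] == b[1] and a[1] not in avoid
    if a[0] == "arrow":
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if a[0] == "rcd":
        return a[1] == b[1] and sub(a[2], b[2], avoid)
    return False

def subst(t, x, body):
    """t[x -> {x : body}]."""
    if t[0] == "var":
        return ("rcd", x, body) if t[1] == x else t
    if t[0] in ("arrow", "and"):
        return (t[0], subst(t[1], x, body), subst(t[2], x, body))
    if t[0] == "rcd":
        return ("rcd", t[1], subst(t[2], x, body))
    return t

def unsubst(t, x):
    """Inverse step: every record labelled x becomes the variable x again."""
    if t[0] == "rcd":
        return ("var", x) if t[1] == x else ("rcd", t[1], unsubst(t[2], x))
    if t[0] in ("arrow", "and"):
        return (t[0], unsubst(t[1], x), unsubst(t[2], x))
    return t

X, S = var("X"), set()          # X is not in S
B = and_(arrow(X, NAT), X)
A2 = arrow(X, NAT)
D, C = TOP, NAT                 # D goes into B, C goes into A2

lhs, rhs = subst(B, "X", D), subst(A2, "X", C)
print(unsubst(lhs, "X") == B, unsubst(rhs, "X") == A2)  # True True
print(sub(lhs, rhs, S))                                  # True
print(sub(unsubst(lhs, "X"), unsubst(rhs, "X"), S))      # True
```

On this instance the substituted judgment holds and so does the un-substituted one, with each ssub_d_rcd step on label X turning into an ssub_d_var step on X, which is allowed because X ∉ S.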

Now the new Lemma 2’

Lemma 2’. If B ~ B \ S ∪ {X} syntactic and B <: A₂ \ S syntactic, then B <: A₂ \ S ∪ {X} syntactic.

Is this provable, and is the swap problem avoided?

Swap analysis: the premise B <: A₂ \ S has B and A₂ on their natural sides, with no substitution wrinkle. When we recurse into an arrow position, the contravariant sub-derivation has A₂_dom <: B_dom \ S — same form (just LHS/RHS swap), no substituent swap. ✓

For Lemma 2’, we’d want to lift this contravariant sub-derivation to A₂_dom <: B_dom \ S ∪ {X}. By the same lemma applied recursively, we’d need A₂_dom ~ A₂_dom \ S ∪ {X} syntactic.

But we don’t have A₂_dom’s self-equivalence. Hypothesis 1 only gives us B’s self-equivalence.

So the swap problem isn’t quite avoided; it morphs into a “we don’t have the recursive hypothesis for A₂” problem.

Trying to derive A₂’s self-equivalence

Can we derive A₂ ~ A₂ \ S ∪ {X} from the hypotheses we have?

We have:

B ~ B \ S ∪ {X} (i.e., B <: B \ S ∪ {X}; both directions coincide).
B <: A₂ \ S.

Using transitivity (if it holds in \ S syntactically): if we had both B <: A₂ \ S and A₂ <: B \ S, we would get A₂ ~ B \ S. Can the second one be extracted from the original premise in the same way?

We have B <: A₂ \ S (from above) and we’d want A₂ <: B \ S to get A₂ ~ B \ S. But the original :> direction was B[X→{X:D}] <: A[X→{X:C}] \ S decomposed to B[X→{X:D}] <: A₁[X→{X:C}] \ S and B[X→{X:D}] <: A₂[X→{X:C}] \ S — we never have A₂[...] <: B[...] directly. (That’s the asymmetry of the stuck case.)

So we can’t get A₂ <: B \ S, hence not A₂ ~ B \ S, hence not A₂ ~ A₂ \ S ∪ {X} via transitivity.

Hmm. So if Lemma 2’ needs A₂’s self-equivalence to recurse, and we can’t derive it, we’re stuck.

Thinking about whether A₂’s self-equivalence is actually needed

Wait. Let me reconsider whether the arrow case really needs A₂_dom’s self-equivalence.

Lemma 2’ says: from B ~ B \ S ∪ {X} and B <: A₂ \ S, derive B <: A₂ \ S ∪ {X}.

Arrow case: B = B_dom -> B_cod, A₂ = A₂_dom -> A₂_cod. Premise 2 gives sub-derivations:

A₂_dom <: B_dom \ S (contravariant)
B_cod <: A₂_cod \ S (covariant)

We want to lift both to \ S ∪ {X}.

For the covariant one B_cod <: A₂_cod \ S: recursing on Lemma 2’ would need B_cod ~ B_cod \ S ∪ {X}. Can we get this?

Hypothesis 1 is B ~ B \ S ∪ {X} = (B_dom -> B_cod) ~ (B_dom -> B_cod) \ S ∪ {X}. By inversion of <: for arrows, this gives us B_dom <: B_dom \ S ∪ {X} (contravariant, which for a self-comparison has the same form as covariant) and B_cod <: B_cod \ S ∪ {X} (covariant). ✓ So we get B_cod’s self-equivalence from hypothesis 1’s structural inversion.

Great. The recursion on B_cod works.

For the contravariant one A₂_dom <: B_dom \ S: recursing on Lemma 2’ would need… wait, here A₂_dom is on the left and B_dom is on the right. Lemma 2’ is stated as “from B’s self-equivalence and B <: A₂ \ S, derive B <: A₂ \ S ∪ {X}”, with B on the left.

For the contravariant case, the role of “the left side with self-equivalence” is now A₂_dom, and “the right side” is B_dom. So we’d need A₂_dom ~ A₂_dom \ S ∪ {X} syntactic. And we don’t have that.

A different recursion strategy

Idea: do the covariant case with one orientation and the contravariant case with the other orientation, and have both directions of self-equivalence available via a stronger hypothesis.

Strengthen Lemma 2’ to require both sides’ self-equivalence:

Lemma 2’ (strong). If B ~ B \ S ∪ {X} syntactic, A₂ ~ A₂ \ S ∪ {X} syntactic, and B <: A₂ \ S syntactic, then B <: A₂ \ S ∪ {X} syntactic.

Then the recursion goes:

Arrow covariant B_cod <: A₂_cod \ S → recurse with B_cod’s and A₂_cod’s self-equivalences (both extractable from hypothesis 1 and the new hypothesis).
Arrow contravariant A₂_dom <: B_dom \ S → recurse with A₂_dom’s and B_dom’s self-equivalences.

Both sides have self-equivalence available, so the recursion goes through.

But now we need both sides’ self-equivalences as inputs. In the inversion lemma’s stuck case, we have B’s but not A₂’s.

How to get A₂’s self-equivalence

Going back to the inversion lemma: maybe I can derive A₂ ~ A₂ \ S ∪ {X} from the structure of the premise more cleverly.

The premise has (A₁ & A₂)[X→{X:C}] ~ B[X→{X:D}] \ S syntactic. Both directions give us derivations.

We don’t have A₂ ~ A₂ \ S directly — we’d need to construct one. In the avoidance system without X, can we derive A₂ ~ A₂ \ S?

From the premise (specifically the :> direction): B[X→{X:D}] <: A₂[X→{X:C}] \ S syntactic. By the inverse-substitution argument (the same as your premise-2 derivation): B <: A₂ \ S syntactic. The symmetric argument applied to the <: direction (P1, which picks branch 1) only gives A₁ <: B \ S, not A₂ <: B \ S.

So we can derive B <: A₂ \ S but not A₂ <: B \ S directly. Getting A₂ ~ A₂ \ S would require more.

Reflexivity of A₂ in \ S: since X ∉ S, the variable X (possibly free in A₂) is not blocked. So A₂ <: A₂ \ S should hold by structural reflexivity, which holds in \ S for any type whose free variables avoid S. A₂’s free variables include X, but X ∉ S, so X causes no problem here (assuming A₂’s other free variables are not blocked by S either).

So A₂ ~ A₂ \ S is just reflexivity in \ S, which holds. ✓

But we want A₂ ~ A₂ \ S ∪ {X}. Now X ∈ S ∪ {X}, and A₂ may contain X. Reflexivity is not automatic.

So we cannot directly get A₂ ~ A₂ \ S ∪ {X}. This is the crux.

What’s going on structurally

Here’s what I think is happening: the lemma we want to prove is, essentially, “given that B’s X is structurally absorbed (hypothesis 1), and given B <: A₂ (premise 2), A₂’s X-occurrences must also be in absorbed positions (because of the structural matching that <: enforces).” So A₂’s self-equivalence in \ S ∪ {X} should be derivable from the premises, not a separate hypothesis.

This is the structural propagation argument: absorption of X in B forces absorption of X in A₂ (when B <: A₂ holds in \ S).

Let me try to prove this:

Lemma 2” (the absorption propagation we actually need). If B ~ B \ S ∪ {X} syntactic and B <: A₂ \ S syntactic, then A₂ ~ A₂ \ S ∪ {X} syntactic.

If this holds, then combined with Lemma 2’ (strong), we get the original Lemma 2’.

Proof attempt by induction on the derivation of B <: A₂ \ S:

Case ssub_d_top: A₂ = Top. Conclusion: Top ~ Top \ S ∪ {X} by ssub_d_top both directions. ✓
Case ssub_d_nat: A₂ = nat. Conclusion: nat ~ nat \ S ∪ {X} by ssub_d_nat. ✓
Case ssub_d_var: A₂ = typ_fvar Y, Y ∉ S, B = typ_fvar Y. From hypothesis 1, Y ~ Y \ S ∪ {X} syntactic. The only way is ssub_d_var requiring Y ∉ S ∪ {X}. So Y ≠ X (since otherwise Y = X ∈ S ∪ {X}, blocked). Conclusion: Y ~ Y \ S ∪ {X} by ssub_d_var (Y ∉ S ∪ {X} since Y ∉ S and Y ≠ X). ✓
Case ssub_d_arrow: A₂ = A_dom -> A_cod, B = B_dom -> B_cod, with sub-derivations A_dom <: B_dom \ S and B_cod <: A_cod \ S. Hypothesis 1 inverts to: B_dom <: B_dom \ S ∪ {X} (contravariant — with reversed roles, but for self-equivalence this is the same as covariant) and B_cod <: B_cod \ S ∪ {X}.

Wait, let me redo. Hypothesis 1 says B <: B \ S ∪ {X}. Inversion: arrow rule. B_dom <: B_dom \ S ∪ {X} (contravariant) and B_cod <: B_cod \ S ∪ {X} (covariant). Both are self-comparisons, so direction doesn’t matter: both give B_dom’s and B_cod’s self-equivalence in \ S ∪ {X}. ✓

Now apply IH (Lemma 2”) to the sub-derivations:

IH on A_dom <: B_dom \ S with B_dom’s self-equivalence: gives A_dom ~ A_dom \ S ∪ {X}. Wait, this uses the swapped version of Lemma 2”: we have B_dom’s self-equivalence (the right side of A_dom <: B_dom), not A_dom’s (the left side).

Hmm. Let me re-read my Lemma 2” statement:

If B ~ B \ S ∪ {X} and B <: A₂ \ S, then A₂ ~ A₂ \ S ∪ {X}.

The hypothesis is “left side has self-equivalence, sub-derivation left <: right.” Conclusion: “right side has self-equivalence.”

In the contravariant arrow case, the sub-derivation is A_dom <: B_dom \ S, so the left side is A_dom and the right side is B_dom. Lemma 2” would give: from “self-equivalence of A_dom” and “A_dom <: B_dom \ S,” conclude “self-equivalence of B_dom.”

But we have self-equivalence of B_dom already (from inversion of hypothesis 1), and we want self-equivalence of A_dom. The lemma direction is wrong for this case.

We’d need a “reverse” version:

Lemma 2”-reverse. If A₂ ~ A₂ \ S ∪ {X} syntactic and B <: A₂ \ S syntactic, then B ~ B \ S ∪ {X} syntactic.

Then in the contravariant arrow case: from B_dom’s self-equivalence (which we have) and A_dom <: B_dom \ S (the sub-derivation), conclude A_dom’s self-equivalence.

So we need both Lemma 2” and Lemma 2”-reverse, mutually recursive. They handle the covariant and contravariant cases respectively in the arrow rule.

Let me verify: covariant case B_cod <: A_cod \ S. We have B_cod’s self-equivalence (from hypothesis 1) and the sub-derivation. Lemma 2” (forward direction): conclude A_cod’s self-equivalence. ✓

Contravariant case A_dom <: B_dom \ S. We have B_dom’s self-equivalence and the sub-derivation. Lemma 2”-reverse: conclude A_dom’s self-equivalence. ✓

Both subcases in the arrow rule work with mutual recursion. Let me check the other rules.

Case ssub_d_and (intersection introduction on the right): A₂ = A_2a & A_2b, with sub-derivations B <: A_2a \ S and B <: A_2b \ S. By IH (forward): A_2a and A_2b self-equivalences. Conclusion: A_2a & A_2b ~ A_2a & A_2b \ S ∪ {X}, deriving each direction:
- A_2a & A_2b <: A_2a & A_2b \ S ∪ {X} requires A_2a & A_2b <: A_2a \ S ∪ {X} and A_2a & A_2b <: A_2b \ S ∪ {X}. Use ssub_d_andL (or R) and the self-equivalences.
- Wrinkle: ssub_d_andL says A_2a & A_2b <: B \ D if A_2a <: B \ D and B is ordinary. So to derive A_2a & A_2b <: A_2a \ S ∪ {X} this way, we’d need A_2a to be ordinary. If A_2a is itself an intersection, we’d need to decompose further; in general, deriving T1 & T2 <: T1 for non-ordinary T1 requires a more complex derivation. Maybe we need an auxiliary “splittability” lemma about your intersection rules. I’ll defer this and assume it’s handled by your system’s standard meta-theory.
Case ssub_d_andL / ssub_d_andR: B = B_a & B_b, A₂ ordinary. Sub-derivation: B_a <: A₂ \ S (or B_b <: A₂ \ S). Hypothesis 1: B_a & B_b ~ B_a & B_b \ S ∪ {X} syntactic. We need self-equivalence of B_a (or B_b) to apply IH.
- By inversion (the right-hand side is an intersection, so intersection introduction requires both components), hypothesis 1 gives B_a & B_b <: B_a \ S ∪ {X} and B_a & B_b <: B_b \ S ∪ {X}.
- But getting B_a <: B_a \ S ∪ {X} from this is a different claim and would need a separate argument. It amounts to asking that the avoidance system has a “subterm” property: subterms of self-equivalent types are self-equivalent. Let me take this on faith for now (it’s a property worth verifying separately).
Case ssub_d_rcd: B = typ_rcd l B', A₂ = typ_rcd l A', sub-derivation B' <: A' \ S. Hypothesis 1: typ_rcd l B' ~ typ_rcd l B' \ S ∪ {X}, which by inversion gives B' <: B' \ S ∪ {X} (i.e., the self-equivalence of B'). IH (forward) on sub-derivation: gives the self-equivalence of A'. Conclusion: typ_rcd l A' ~ typ_rcd l A' \ S ∪ {X} by ssub_d_rcd. ✓

So the cases mostly work, modulo the technical wrinkles around intersection structural properties.

Putting it together

The plan:

Lemma 2” (forward). If B ~ B \ S ∪ {X} syntactic and B <: A₂ \ S syntactic, then A₂ ~ A₂ \ S ∪ {X} syntactic.

Lemma 2”-reverse. If A₂ ~ A₂ \ S ∪ {X} syntactic and B <: A₂ \ S syntactic, then B ~ B \ S ∪ {X} syntactic.

These are mutually recursive, with the recursion handling covariant (forward) and contravariant (reverse) positions in arrows.

Lemma 2’ (final). If B ~ B \ S ∪ {X} syntactic and B <: A₂ \ S syntactic, then B <: A₂ \ S ∪ {X} syntactic.

By induction on the derivation of B <: A₂ \ S, using Lemma 2” and 2”-reverse to obtain self-equivalences when recursing into arrow positions.

Actually, I realize Lemma 2’ might be derivable directly without separately invoking Lemmas 2” and 2”-reverse. The induction on B <: A₂ \ S walks through both types’ structures, and at each step, the avoidance set just grows from S to S ∪ {X}. Hypothesis 1 ensures that wherever X would be problematic (e.g., an ssub_d_var step on X on B’s side), it’s absorbed. This works provided A₂’s X is also absorbed, which we should get from the structural matching of <:.

Let me try Lemma 2’ directly:

Lemma 2’ direct proof attempt by induction on B <: A₂ \ S:

ssub_d_top: A₂ = Top. Conclusion B <: Top \ S ∪ {X} by ssub_d_top (need lc_typ B, which we have). ✓
ssub_d_nat: B = A₂ = nat. Conclusion nat <: nat \ S ∪ {X}. ✓
ssub_d_var: B = A₂ = typ_fvar Y, Y ∉ S. From hypothesis 1, Y <: Y \ S ∪ {X}, requiring Y ∉ S ∪ {X}. So Y ≠ X. Conclusion: Y <: Y \ S ∪ {X} by ssub_d_var. ✓
ssub_d_arrow: Sub-derivations A₂_dom <: B_dom \ S and B_cod <: A₂_cod \ S. From hypothesis 1 inverted: B_dom <: B_dom \ S ∪ {X} and B_cod <: B_cod \ S ∪ {X}.
- Covariant: IH on B_cod <: A₂_cod \ S with B_cod’s self-equivalence: gives B_cod <: A₂_cod \ S ∪ {X}. ✓
- Contravariant: IH on A₂_dom <: B_dom \ S with… A₂_dom’s self-equivalence? We don’t have it directly.

So the contravariant case in the direct proof of Lemma 2’ needs A₂_dom’s self-equivalence, which we’d get from Lemma 2” (forward) applied to the sub-derivation. So the direct proof of Lemma 2’ actually does need Lemma 2” as a sub-step.

Concretely: in the contravariant case, before applying IH on Lemma 2’, first apply Lemma 2” (forward) to A₂_dom <: B_dom \ S with B_dom’s self-equivalence (wait, that’s reverse direction)… hmm.

OK let me re-examine. The sub-derivation is A₂_dom <: B_dom \ S. To apply Lemma 2’ to it, we’d need self-equivalence of A₂_dom (the left side). To get A₂_dom’s self-equivalence, use Lemma 2”-reverse: input is B_dom’s self-equivalence (have) and A₂_dom <: B_dom \ S (have); output is A₂_dom’s self-equivalence. ✓

So:

Lemma 2”-reverse: A₂_dom’s self-equivalence from B_dom’s self-equivalence and the sub-derivation.
Lemma 2’ (recursive): A₂_dom <: B_dom \ S ∪ {X} from A₂_dom’s self-equivalence (just derived) and the sub-derivation.

So Lemma 2’ invokes Lemma 2”-reverse, which itself recurses. The mutual recursion is between Lemma 2’ and 2”-reverse (and possibly 2”-forward, depending on how things shake out).

Honest summary

I think the approach works, with this structure:

Lemma 2” (forward). From B’s self-eq and B <: A₂ \ S, derive A₂’s self-eq in \ S ∪ {X}.

Lemma 2”-reverse. From A₂’s self-eq and B <: A₂ \ S, derive B’s self-eq in \ S ∪ {X}.

Lemma 2’ (the goal). From B’s self-eq and B <: A₂ \ S, derive B <: A₂ \ S ∪ {X}.

These three lemmas are mutually inductive on the size of the derivation B <: A₂ \ S (or possibly on the size of the types). The arrow case in Lemma 2’ uses Lemma 2”-reverse to bootstrap A₂_dom’s self-equivalence, then recurses.

Caveats:

Intersection cases need careful handling. The ssub_d_and (right-intersection) and ssub_d_andL/R (left-intersection) cases require deriving self-equivalences of intersection components from self-equivalence of the whole. This is a structural property of your system that I’m assuming holds but haven’t fully verified.
Inversion of self-equivalence into component self-equivalences is the key auxiliary lemma needed. Specifically: if T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} syntactic, then both T₁ ~ T₁ \ S ∪ {X} and T₂ ~ T₂ \ S ∪ {X} syntactic. This needs to be checked carefully.
The “drop substituents to X” transformation of premise 2 is sound, given X ∉ S and X doesn’t appear free as a label originally. This was your insight and seems right.
The system’s transitivity is needed at various places. Assumed admissible.

This formulation does avoid the substituent-swap problem from before. Instead it has a “self-equivalence propagation” obligation, which is more tractable: it’s a structural property that should hold cleanly.

Want me to sketch one of the cases in more detail (e.g., the intersection cases) to verify, or is this enough to check whether it fits with your formalization?

You

Can you first give me a proof draft of this lemma directly by induction on the second premise? Before you move on to try to think of other fancy lemmas, show me how it goes or show me how it is stuck so I can understand the problem.

Lemma 2’ (syntactic). Suppose:

B ~ B \ S ∪ {X} syntactic. (Syntactic substitution-invariance of B w.r.t. X.)
B[X] <: A₂[X] \ S syntactic.

Then B <: A₂ \ S ∪ {X} syntactic.

Claude

Direct Proof Sketch by Induction on Premise 2

Let me try this honestly, going case-by-case on the last rule of the derivation B <: A₂ \ S (where I’m reading premise 2 as the derivation in \ S of B against A₂, both possibly mentioning X freely as a variable, with X ∉ S).

The induction hypothesis: for any sub-derivation B' <: A₂' \ S of strictly smaller size, if B' ~ B' \ S ∪ {X} holds, then B' <: A₂' \ S ∪ {X}.

Below I’ll trace each rule and check whether it goes through.

Case ssub_d_top

A₂ = Top. We want B <: Top \ S ∪ {X}. By ssub_d_top (need lc_typ B, which we have).

✓ Closes directly. No use of premise 1 needed.

Case ssub_d_nat

B = A₂ = nat. We want nat <: nat \ S ∪ {X} by ssub_d_nat.

✓ Closes directly.

Case ssub_d_var

B = A₂ = typ_fvar Y, with Y ∉ S. We want Y <: Y \ S ∪ {X} by ssub_d_var, needing Y ∉ S ∪ {X}.

We have Y ∉ S already. We need Y ≠ X.

How do we know Y ≠ X? Use premise 1: Y ~ Y \ S ∪ {X}. The only way to derive Y <: Y \ S ∪ {X} is via ssub_d_var, which requires Y ∉ S ∪ {X}. So Y ≠ X.

✓ Closes using premise 1.

Case ssub_d_arrow

B = B_dom -> B_cod, A₂ = A_dom -> A_cod. Sub-derivations:

(sub-a) A_dom <: B_dom \ S
(sub-b) B_cod <: A_cod \ S

Premise 1 inverted: (B_dom -> B_cod) <: (B_dom -> B_cod) \ S ∪ {X}. The only rule applicable (other than ssub_d_top if B = Top, which it isn’t) is ssub_d_arrow, giving:

B_dom <: B_dom \ S ∪ {X} (contravariant — but for self, direction doesn’t matter)
B_cod <: B_cod \ S ∪ {X} (covariant)

So B_dom ~ B_dom \ S ∪ {X} and B_cod ~ B_cod \ S ∪ {X} syntactically.

Covariant sub-goal: want B_cod <: A_cod \ S ∪ {X}. Apply IH to (sub-b) with B_cod’s self-equivalence. ✓

Contravariant sub-goal: want A_dom <: B_dom \ S ∪ {X}. Apply IH to (sub-a)… but the IH requires self-equivalence of the left type of the sub-derivation, which is A_dom. We don’t have A_dom ~ A_dom \ S ∪ {X}.

We have B_dom’s self-equivalence, not A_dom’s.

❌ STUCK.

This is the swap problem you were hoping to avoid. Let me check whether the reformulation actually avoids it.

The original swap was about substituents ({X:C} vs {X:D} flipping when crossing contravariant boundaries). With your reformulation (drop substituents, work with X as variable), the substituent-flip is gone — but the self-equivalence side still flips.

In the contravariant arrow case, the LHS of the inner subtyping A_dom <: B_dom is A_dom, which was on the right of the outer B <: A₂. The IH wants self-equivalence of the inner LHS, which is the outer’s RHS: a side we never had.

So the swap problem morphs from “substituent flip” to “self-equivalence side flip.” It hasn’t been eliminated, just moved.
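
To make the side flip concrete, here is the arrow rule instantiated for this case, typeset as a sketch. Side conditions such as lc_typ are omitted, the rule name is the one used above, and the two annotation lines only record which self-equivalence the induction hypothesis would consume at each premise.

```latex
% Assumes amsmath. Restates the arrow case only; nothing new is proved here.
\[
\frac{A_{\mathit{dom}} <: B_{\mathit{dom}} \setminus S
      \qquad
      B_{\mathit{cod}} <: A_{\mathit{cod}} \setminus S}
     {B_{\mathit{dom}} \to B_{\mathit{cod}} \;<:\; A_{\mathit{dom}} \to A_{\mathit{cod}} \setminus S}
\quad \textsc{ssub\_d\_arrow}
\]
\begin{align*}
\text{(sub-b)}\quad & \text{IH needs } B_{\mathit{cod}} \sim B_{\mathit{cod}} \setminus S \cup \{X\}
  && \text{available from premise 1} \\
\text{(sub-a)}\quad & \text{IH needs } A_{\mathit{dom}} \sim A_{\mathit{dom}} \setminus S \cup \{X\}
  && \text{not available}
\end{align*}
```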

Case ssub_d_and (intersection introduction on right)

A₂ = A_a & A_b, sub-derivations B <: A_a \ S and B <: A_b \ S.

We want B <: A_a & A_b \ S ∪ {X}. By ssub_d_and: derive B <: A_a \ S ∪ {X} and B <: A_b \ S ∪ {X}.

Apply IH to each sub-derivation, with B’s self-equivalence (which we have unchanged from premise 1).

✓ Closes using IH and premise 1. The B-side self-equivalence stays the same; we don’t need anything about A_a or A_b separately.

Case ssub_d_andL (intersection elimination on left, picking branch a)

B = B_a & B_b, A₂ ordinary. Sub-derivation: B_a <: A₂ \ S.

We want B_a & B_b <: A₂ \ S ∪ {X}. By ssub_d_andL: derive B_a <: A₂ \ S ∪ {X} (and side conditions lc_typ B_b, ord A₂).

Apply IH to the sub-derivation: need B_a’s self-equivalence in \ S ∪ {X}.

We have B_a & B_b ~ B_a & B_b \ S ∪ {X} (premise 1). Does this give us B_a ~ B_a \ S ∪ {X}?

Let me think. B_a & B_b <: B_a & B_b \ S ∪ {X} could be derived in several ways. Let’s invert.

The RHS is B_a & B_b, an intersection. By ssub_d_and (only applicable rule for intersection on the right), we need B_a & B_b <: B_a \ S ∪ {X} and B_a & B_b <: B_b \ S ∪ {X}.

Now the LHS is also B_a & B_b, an intersection. The first sub-derivation B_a & B_b <: B_a \ S ∪ {X} requires B_a ordinary (for ssub_d_andL / R) — wait, actually ssub_d_andL requires the RHS to be ordinary. So it needs B_a ordinary.

If B_a is ordinary, then B_a & B_b <: B_a \ S ∪ {X} derives via ssub_d_andL from B_a <: B_a \ S ∪ {X} (✓ — what we want!). So we get B_a <: B_a \ S ∪ {X}. ✓

If B_a is not ordinary (i.e., is itself an intersection), the inversion gets recursive, but in any case we should be able to extract B_a’s self-equivalence by deeper inversion. Let me assume this auxiliary inversion lemma holds: self-equivalence of B_a & B_b implies self-equivalence of B_a (and similarly B_b), derivable by inversion + intersection elimination.

Granting this lemma: ✓ Closes using IH, premise 1, and an auxiliary inversion property.

Case ssub_d_andR (intersection elimination on left, picking branch b)

Symmetric to ssub_d_andL. ✓ with the same auxiliary lemma.

Case ssub_d_rcd

B = typ_rcd l B', A₂ = typ_rcd l A', sub-derivation B' <: A' \ S.

We want typ_rcd l B' <: typ_rcd l A' \ S ∪ {X} by ssub_d_rcd from B' <: A' \ S ∪ {X}.

Apply IH to sub-derivation: need B' ‘s self-equivalence.

Premise 1: typ_rcd l B' <: typ_rcd l B' \ S ∪ {X}. By inversion (only ssub_d_rcd applies for label-on-right when LHS is also a label), gives B' <: B' \ S ∪ {X}. ✓

✓ Closes using IH and premise 1.

(Note: the label l here is just the structural identifier; in our setup, l = X would mean an X-labeled record, which we said cannot appear in the original B, A₂, though it might appear in the substituted versions. In your reformulation where we dropped substituents, l is some other label name, which is fine.)

Summary of where it stuck

The proof goes through cleanly for:

ssub_d_top (trivial)
ssub_d_nat (trivial)
ssub_d_var (uses premise 1)
ssub_d_and (uses IH straightforwardly)
ssub_d_andL/R (uses IH + auxiliary inversion of premise 1)
ssub_d_rcd (uses IH + inversion of premise 1)

It gets stuck in ssub_d_arrow, specifically the contravariant case. The IH wants A_dom’s self-equivalence (the RHS of the outer becomes the LHS of the inner under contravariance), and we only have B’s self-equivalence.

This is the same fundamental obstruction as before. Reformulating the lemma to drop substituents avoided the substituent-swap, but the self-equivalence side-swap remains.

What this tells us

The reformulation cleans up the bookkeeping (no substituent swaps) but doesn’t resolve the underlying issue: the proof of Lemma 2’ inherently needs both sides’ self-equivalence available at recursive arrow-contravariant steps.

Three honest paths from here:

Mutual recursion with auxiliary lemma: Add a “propagation” lemma (Lemma 2” as I sketched before) that derives A₂’s self-equivalence from B’s self-equivalence and B <: A₂. Use mutual recursion to handle the contravariant case.
Strengthen Lemma 2’ to take both sides’ self-equivalence as hypothesis: Then in the inversion lemma’s stuck case, derive A₂’s self-equivalence separately (perhaps by Lemma 2” or another route) before applying Lemma 2’.
Different induction: Maybe induct on something other than the derivation of premise 2 — though I don’t immediately see what would work better.

The honest takeaway is that the contravariant arrow case is the genuine technical obstacle, and any proof has to handle it via some mechanism that propagates self-equivalence across the contravariance boundary. Your reformulation didn’t eliminate this need; it just localizes the difficulty more cleanly.

Want to explore option 1 (mutual recursion) or option 2 (strengthen hypotheses) next?

You

I see. How about this one?

Lemma 2’ (syntactic). Suppose A <: B \ S. Then:
(1) if A ~ A \ S ∪ {X} then A <: B \ S ∪ {X}
(2) if B ~ B \ S ∪ {X} then A <: B \ S ∪ {X}

Can this be proved by induction on the premise, proving both goals simultaneously?

Give me a proof draft

Claude

Proof Draft: Strengthened Lemma 2’ with Two Goals

This is a clever move — by stating both goals together, the mutual recursion becomes a single induction proving both at once, with each goal feeding the other at the contravariant arrow step.

Let me try this carefully.

Statement

Lemma 2’. Suppose A <: B \ S syntactic. Then: (1) If A ~ A \ S ∪ {X} syntactic, then A <: B \ S ∪ {X} syntactic. (2) If B ~ B \ S ∪ {X} syntactic, then A <: B \ S ∪ {X} syntactic.

We prove (1) and (2) simultaneously by induction on the derivation of A <: B \ S.

Cases

Case ssub_d_nat

A = B = nat. Want nat <: nat \ S ∪ {X} by ssub_d_nat. ✓ for both (1) and (2).

Case ssub_d_top

B = Top. Want A <: Top \ S ∪ {X} by ssub_d_top (need lc_typ A, given). ✓ for both.

Case ssub_d_var

A = B = typ_fvar Y, Y ∉ S. Want Y <: Y \ S ∪ {X} by ssub_d_var, requiring Y ∉ S ∪ {X}, i.e., Y ≠ X.

For (1): premise is Y ~ Y \ S ∪ {X}. The only way to derive this is ssub_d_var, which requires Y ∉ S ∪ {X}. So Y ≠ X. ✓

For (2): same — Y ~ Y \ S ∪ {X} forces Y ≠ X. ✓

Case ssub_d_arrow

A = A_dom -> A_cod, B = B_dom -> B_cod. Sub-derivations:

(sub-contra) B_dom <: A_dom \ S
(sub-cov) A_cod <: B_cod \ S

Want: (A_dom -> A_cod) <: (B_dom -> B_cod) \ S ∪ {X} via ssub_d_arrow from B_dom <: A_dom \ S ∪ {X} and A_cod <: B_cod \ S ∪ {X}.

For (1): Premise (A_dom -> A_cod) ~ (A_dom -> A_cod) \ S ∪ {X}. Inversion gives A_dom <: A_dom \ S ∪ {X} and A_cod <: A_cod \ S ∪ {X} — i.e., self-eq of A_dom and A_cod.

Sub-contra: B_dom <: A_dom \ S. Apply IH (2) with A_dom’s self-equivalence (since A_dom is the RHS of this sub-derivation). Get B_dom <: A_dom \ S ∪ {X}. ✓
Sub-cov: A_cod <: B_cod \ S. Apply IH (1) with A_cod’s self-equivalence (since A_cod is the LHS). Get A_cod <: B_cod \ S ∪ {X}. ✓

Combine via ssub_d_arrow. ✓ for (1).

For (2): Premise (B_dom -> B_cod) ~ (B_dom -> B_cod) \ S ∪ {X}. Inversion gives B_dom’s and B_cod’s self-equivalences.

Sub-contra: B_dom <: A_dom \ S. Apply IH (1) with B_dom’s self-equivalence (LHS). Get B_dom <: A_dom \ S ∪ {X}. ✓
Sub-cov: A_cod <: B_cod \ S. Apply IH (2) with B_cod’s self-equivalence (RHS). Get A_cod <: B_cod \ S ∪ {X}. ✓

Combine via ssub_d_arrow. ✓ for (2).

🎉 The arrow case works for both goals! This is exactly what the dual formulation buys us: at each sub-derivation, we have a choice of which goal’s IH to apply based on which side’s self-equivalence we have.
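
As a compact record of the bookkeeping in this case, the following LaTeX table lists, for each outer goal and each premise of ssub_d_arrow, which self-equivalence is in hand and which half of the induction hypothesis consumes it. It is only a typeset summary of the four steps just traced, with lc_typ side conditions left out.

```latex
% Which induction hypothesis each premise of ssub_d_arrow consumes.
\begin{tabular}{llll}
\hline
Outer goal & Sub-derivation & Self-equivalence in hand & IH applied \\
\hline
(1) & $B_{\mathit{dom}} <: A_{\mathit{dom}} \setminus S$ & $A_{\mathit{dom}}$ (on the right) & IH (2) \\
(1) & $A_{\mathit{cod}} <: B_{\mathit{cod}} \setminus S$ & $A_{\mathit{cod}}$ (on the left)  & IH (1) \\
(2) & $B_{\mathit{dom}} <: A_{\mathit{dom}} \setminus S$ & $B_{\mathit{dom}}$ (on the left)  & IH (1) \\
(2) & $A_{\mathit{cod}} <: B_{\mathit{cod}} \setminus S$ & $B_{\mathit{cod}}$ (on the right) & IH (2) \\
\hline
\end{tabular}
```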

Case ssub_d_and (intersection introduction on right)

B = B_a & B_b, sub-derivations A <: B_a \ S and A <: B_b \ S.

Want A <: B_a & B_b \ S ∪ {X} via ssub_d_and from A <: B_a \ S ∪ {X} and A <: B_b \ S ∪ {X}.

For (1): Premise A ~ A \ S ∪ {X}. We have A’s self-equivalence. Apply IH (1) to each sub-derivation with A’s self-equivalence (unchanged: it’s the same A on the LHS in both sub-derivations). Get A <: B_a \ S ∪ {X} and A <: B_b \ S ∪ {X}. Combine via ssub_d_and. ✓

For (2): Premise B_a & B_b ~ B_a & B_b \ S ∪ {X}. We need B_a’s and B_b’s self-equivalences to apply IH (2) on each sub-derivation.

Subclaim: B_a & B_b ~ B_a & B_b \ S ∪ {X} syntactic implies B_a ~ B_a \ S ∪ {X} syntactic and B_b ~ B_b \ S ∪ {X} syntactic.

This needs verification. Inversion: the derivation of B_a & B_b <: B_a & B_b \ S ∪ {X} must use ssub_d_and (intersection introduction on the right), giving sub-derivations:

B_a & B_b <: B_a \ S ∪ {X}
B_a & B_b <: B_b \ S ∪ {X}

If B_a is ordinary, the first must use ssub_d_andL or ssub_d_andR. Say ssub_d_andL: gives B_a <: B_a \ S ∪ {X}. ✓ (Self-equivalence of B_a directly.)

If B_a is itself an intersection, we recurse: by induction on B_a’s structure, we extract self-equivalences of its components.

Granting this subclaim (which is a structural property of your system, provable by induction on the structure of B_a):

Apply IH (2) to each sub-derivation with B_a’s and B_b’s self-equivalences respectively. Get A <: B_a \ S ∪ {X} and A <: B_b \ S ∪ {X}. Combine via ssub_d_and. ✓

Case ssub_d_andL (intersection elimination on left, branch a)

A = A_a & A_b, B ordinary. Sub-derivation: A_a <: B \ S. Side conditions: lc_typ A_b, ord B.

Want A_a & A_b <: B \ S ∪ {X} via ssub_d_andL from A_a <: B \ S ∪ {X}.

For (1): Premise A_a & A_b ~ A_a & A_b \ S ∪ {X}. By the subclaim above, A_a ~ A_a \ S ∪ {X} syntactic. Apply IH (1) to sub-derivation with A_a’s self-equivalence: get A_a <: B \ S ∪ {X}. Apply ssub_d_andL (with lc_typ A_b, ord B from premise). ✓

For (2): Premise B ~ B \ S ∪ {X}. We have B’s self-equivalence, unchanged. Apply IH (2) to sub-derivation with B’s self-equivalence: get A_a <: B \ S ∪ {X}. Apply ssub_d_andL. ✓

Case ssub_d_andR

Symmetric to ssub_d_andL. ✓ for both.

Case ssub_d_rcd

A = typ_rcd l A', B = typ_rcd l B'. Sub-derivation: A' <: B' \ S.

Want typ_rcd l A' <: typ_rcd l B' \ S ∪ {X} via ssub_d_rcd from A' <: B' \ S ∪ {X}.

For (1): Premise typ_rcd l A' ~ typ_rcd l A' \ S ∪ {X}. Inversion via ssub_d_rcd: gives A' <: A' \ S ∪ {X}. Apply IH (1) to sub-derivation with the self-equivalence of A'. ✓

For (2): Premise typ_rcd l B' ~ typ_rcd l B' \ S ∪ {X}. Inversion: gives B' <: B' \ S ∪ {X}. Apply IH (2) to sub-derivation with the self-equivalence of B'. ✓

Summary

🎉 All cases close. The dual formulation works.

The key insight is in ssub_d_arrow: when we recurse into the contravariant sub-derivation B_dom <: A_dom \ S, the LHS of the inner is B_dom (the original RHS) and the RHS is A_dom (the original LHS). Whichever goal we’re proving (1 or 2):

If proving (1) on the outer (we have A’s self-eq, hence A_dom’s), the inner has A_dom on its RHS, so we apply (2) of the IH (which expects RHS self-equivalence).
If proving (2) on the outer (we have B’s self-eq, hence B_dom’s), the inner has B_dom on its LHS, so we apply (1) of the IH (which expects LHS self-equivalence).

The dual statement creates a tight feedback loop where each direction’s IH supplies what the other needs at contravariance flips.

Auxiliary lemma needed

Self-equivalence inversion for intersections: If T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} syntactic, then T₁ ~ T₁ \ S ∪ {X} and T₂ ~ T₂ \ S ∪ {X} syntactic.

This is needed in the ssub_d_and case for (2) and the ssub_d_andL/R case for (1). It should hold by structural inversion on the derivation of the self-equivalence, decomposing the intersection rules. The full proof requires care because the rule application can vary depending on whether T₁ or T₂ is itself ordinary or further intersected, but the structural intuition is sound.

Caveats I want to flag honestly

The intersection self-equivalence inversion isn’t proven here — I sketched it informally. For your formalization, you’d want to write it out and verify it goes through cleanly with your specific rule set. There might be subtleties depending on how ord interacts with intersection components.
The ssub_d_arrow inversion I used (inverting premise 1 to obtain domain and codomain self-equivalences) assumes that an arrow’s self-equivalence in \ S ∪ {X} can only be derived by ssub_d_arrow, not by some indirect route (e.g., through Top if Top <: arrow were derivable, which it isn’t, but worth confirming). Looking at your rules, the only rules for _ <: arrow are ssub_d_arrow itself and the intersection-elimination rules (which don’t apply when both sides are arrows, not intersections). So the inversion is clean.
The induction is on the derivation size, and each case strictly decreases. This should be straightforward to formalize.
I haven’t fully checked all possible derivation shapes — for example, if your system allows alternative derivations of A <: B \ S that don’t directly correspond to a single rule application, things might be more subtle. But your rule set looks deterministic enough that this shouldn’t be an issue.

This formulation looks like the right one to me. The dual goals dovetail beautifully through the contravariance flip, and the rest of the cases are straightforward. Want to proceed to using this Lemma 2’ in the inversion lemma proof, or do you want to first work out the intersection self-equivalence inversion auxiliary lemma in detail?

You

Can you check the subroutine more closely? I don’t think you can infer from two intersections being equivalent to each other that the two components match each other precisely.

Claude

Checking the Intersection Self-Equivalence Inversion

You’re right to push on this. Let me look at the subclaim carefully.

The subclaim

Subclaim: T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} syntactic implies T₁ ~ T₁ \ S ∪ {X} syntactic and T₂ ~ T₂ \ S ∪ {X} syntactic.

Let me try to derive this and see where it fails.

Attempted derivation

T₁ & T₂ <: T₁ & T₂ \ S ∪ {X} syntactic. The RHS is an intersection, so the only rule that can introduce it is ssub_d_and, giving:

T₁ & T₂ <: T₁ \ S ∪ {X}
T₁ & T₂ <: T₂ \ S ∪ {X}

Now consider T₁ & T₂ <: T₁ \ S ∪ {X}. The LHS is an intersection. If T₁ is ordinary, either ssub_d_andL (giving T₁ <: T₁ \ S ∪ {X} from the left branch) or ssub_d_andR (giving T₂ <: T₁ \ S ∪ {X} from the right branch) may have been used.

This is the crux of your concern. The rule could pick either branch. We don’t know that the derivation chose ssub_d_andL (giving us T₁ <: T₁) — it might have chosen ssub_d_andR, giving us T₂ <: T₁.

Concrete problematic example

Consider T₁ = Top -> Top, T₂ = Top -> Top. Then T₁ & T₂ = (Top -> Top) & (Top -> Top). Self-equivalence holds trivially. To prove T₁ & T₂ <: T₁, the derivation could pick either branch — both work. From the derivation alone, we can’t tell which.

But in this example, T₁ = T₂, so it doesn’t matter — we get T₁ <: T₁ either way.

Try a non-trivial example: T₁ = Top -> Top, T₂ = Top -> X (where X is the avoidance variable). Then T₂ ~ T₂ \ S ∪ {X} requires Top -> X <: Top -> X \ S ∪ {X}, requiring X <: X \ S ∪ {X} — blocked, since X ∈ S ∪ {X}. So T₂ ~ T₂ \ S ∪ {X} fails.

Does T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} hold for this T₁ & T₂ = (Top -> Top) & (Top -> X)? We need T₁ & T₂ <: T₁ & T₂ \ S ∪ {X}. Decompose:

T₁ & T₂ <: T₁ \ S ∪ {X} = (Top -> Top) & (Top -> X) <: Top -> Top \ S ∪ {X}. Pick ssub_d_andL: Top -> Top <: Top -> Top \ S ∪ {X} ✓ (need lc_typ T₂ ✓, ord (Top -> Top) ✓).
T₁ & T₂ <: T₂ \ S ∪ {X} = (Top -> Top) & (Top -> X) <: Top -> X \ S ∪ {X}. Pick ssub_d_andR: Top -> X <: Top -> X \ S ∪ {X} requires X <: X \ S ∪ {X}, which is blocked. Pick ssub_d_andL instead: Top -> Top <: Top -> X \ S ∪ {X} requires (covariantly) Top <: X \ S ∪ {X}, which fails (with an ordinary right-hand side, Top is a subtype only of Top, not of X).

So T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} fails for this example. Good — the subclaim’s hypothesis isn’t satisfied here, so this isn’t a counterexample yet. But it does show that the active X in T₂ propagates to the intersection’s self-equivalence.

Try harder: when does the subclaim’s hypothesis hold but conclusion fail?

I need an example where T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} holds but, say, T₁ ~ T₁ \ S ∪ {X} fails.

For T₁ ~ T₁ \ S ∪ {X} to fail, T₁ must have X in an “active” position (e.g., T₁ = Top -> X, with X covariantly active in the codomain).

For T₁ & T₂ <: T₁ & T₂ \ S ∪ {X} to hold, we need T₁ & T₂ <: T₁ \ S ∪ {X}. With T₁ = Top -> X and T₁ ordinary, we use ssub_d_andL or ssub_d_andR:

ssub_d_andL: T₁ <: T₁ \ S ∪ {X} = Top -> X <: Top -> X \ S ∪ {X} requires X <: X \ S ∪ {X} — blocked.
ssub_d_andR: T₂ <: T₁ \ S ∪ {X} = T₂ <: Top -> X \ S ∪ {X}. For this to succeed, T₂ must supply an arrow whose codomain is <: X \ S ∪ {X}. With X on the right, neither ssub_d_top nor ssub_d_and applies, so any such derivation has to end in the variable rule X <: X (possibly after dropping intersection branches on the left), and that rule is blocked.

So no T₂ makes T₂ <: Top -> X \ S ∪ {X} derivable. Hence T₁ & T₂ <: T₁ \ S ∪ {X} cannot be derived for this T₁. So the subclaim’s hypothesis fails — not a counterexample.

Hmm, the active X in T₁ seems to propagate to the intersection unavoidably. Let me try a more clever construction.

A more clever attempt

What if T₁ has X actively but the intersection structure “redirects” through T₂?

T₁ = X, T₂ = Top. Then T₁ & T₂ = X & Top.

T₁ & T₂ <: T₁ \ S ∪ {X} = X & Top <: X \ S ∪ {X}. With X ordinary on the right:

ssub_d_andL: X <: X \ S ∪ {X} — blocked.
ssub_d_andR: Top <: X \ S ∪ {X} — fails (Top is only <: of Top).

So can’t derive. Subclaim hypothesis fails.

T₁ = Top -> X, T₂ = X -> Top (so each has an active X, at different polarities). T₁ & T₂ <: T₁ \ S ∪ {X} = ... <: Top -> X \ S ∪ {X}:

ssub_d_andL: T₁ <: T₁, blocked as before.
ssub_d_andR: X -> Top <: Top -> X requires (contravariant) Top <: X — fails.

T₁ & T₂ <: T₂ \ S ∪ {X} = ... <: X -> Top \ S ∪ {X}:

ssub_d_andL: Top -> X <: X -> Top requires (contravariant) X <: Top ✓, (covariant) X <: Top ✓. ✓
(Or ssub_d_andR: T₂ <: T₂, blocked.)

So one direction works (via “the other branch handles it”), but the first direction (<: T₁) can’t be derived. Hence the intersection’s self-equivalence doesn’t hold.

I’m finding that whenever T₁ has X actively, T₁ & T₂ <: T₁ \ S ∪ {X} is hard to derive without using T₁’s own self-equivalence (which is blocked).

Trying yet another configuration

What if the redirection works via Top?

T₁ = X -> Top (X in contravariant position, with codomain Top). Does T₁ ~ T₁ \ S ∪ {X} hold?

X -> Top <: X -> Top \ S ∪ {X} via ssub_d_arrow: contravariantly X <: X \ S ∪ {X} — blocked.

So even though X is contravariant, the variable rule for X is blocked because we need to compare X reflexively at that position. Hmm.

So T₁ = X -> Top doesn’t satisfy T₁ ~ T₁ \ S ∪ {X}. The reflexivity comparison hits the blocked variable rule regardless of polarity. Interesting.

What types T do satisfy T ~ T \ S ∪ {X}?

T with no occurrence of X: trivially.
T = Top: trivially via ssub_d_top.
T = T_dom -> Top where T_dom may contain X: T <: T \ S ∪ {X} via ssub_d_arrow: contravariantly T_dom <: T_dom \ S ∪ {X} (recurse), covariantly Top <: Top ✓. So we need T_dom’s self-equivalence in \ S ∪ {X}: recurse.
T = Top -> T_cod: covariantly T_cod <: T_cod \ S ∪ {X} — recurse on T_cod.

Hmm, but the recursion eventually has to bottom out at the X variable (if X appears) — and reflexivity on X is blocked.

So actually, T ~ T \ S ∪ {X} holds iff X doesn’t appear in T at all, OR X only appears in positions that get absorbed by Top during the derivation of T <: T.

When does an occurrence of X get absorbed? Looking at the rules:

ssub_d_top: absorbs everything on the LHS, including any occurrences of X on the LHS, when the RHS is Top.

So an X-occurrence is “absorbed” in the T <: T derivation when, at some recursive sub-derivation, the corresponding RHS position is Top. But for T <: T, the RHS at every position matches the LHS structurally, so the RHS at X’s position is also X, not Top.

Wait, this suggests T ~ T \ S ∪ {X} fails whenever X appears in T at all, unless the derivation can somehow skip X’s path!

Let me verify: if X appears in T at some path p, then in the derivation of T <: T \ S ∪ {X}, we recurse along path p (via arrows, intersections, records), and at the end we hit X <: X — blocked.

Are there ways to “skip” path p? In the arrow case, no — both contravariant and covariant sub-derivations are required. In the intersection-on-right (ssub_d_and), both branches are required. In the intersection-on-left (ssub_d_andL/R), we can pick a branch — so if X is in T₁ of T = T₁ & T₂, when proving T₁ & T₂ <: T₂_part \ S ∪ {X} for some part T₂_part, we can pick the T₂ branch and skip T₁. But for proving T₁ & T₂ <: T₁_part \ S ∪ {X}, we’d want to pick T₁ to make structural matching work — and then we hit the X in T₁.

Hmm, but maybe we can pick T₂ even when proving T₁ & T₂ <: T₁_part, if T₂ <: T₁_part happens to hold.

This is exactly the case I was trying to construct earlier. Let me try once more.

A potential counterexample

T₁ = X -> Top, T₂ = X -> Top (both branches have X contravariantly). T₁ & T₂ self-equivalence in \ S ∪ {X}?

T₁ & T₂ <: T₁ \ S ∪ {X} = (X -> Top) & (X -> Top) <: X -> Top \ S ∪ {X} (with RHS ordinary).

ssub_d_andL: X -> Top <: X -> Top requires X <: X \ S ∪ {X} — blocked.
ssub_d_andR: X -> Top <: X -> Top — same, blocked.

Still blocked. Hmm.

What if T₁ and T₂ are arrows whose contravariant X is absorbed by Top in the other branch?

Wait, the contravariant subsumption between branches doesn’t help — when we pick ssub_d_andL, only T₁ <: T₁_part is needed; T₂ isn’t involved.

Let me think about this more carefully. Maybe the only way to get T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} is for both T₁ and T₂ to satisfy self-equivalence individually (possibly with the help of the other branch only via subsumption-replacement).

Try: T₁ = X -> Top, T₂ = Top -> Top. Want T₁ & T₂ <: T₁ \ S ∪ {X} = ... <: X -> Top \ S ∪ {X}:

ssub_d_andL: X -> Top <: X -> Top blocked.
ssub_d_andR: Top -> Top <: X -> Top requires (contravariant) X <: Top \ S ∪ {X} ✓ (ssub_d_top!), (covariant) Top <: Top ✓. ✓ Top -> Top <: X -> Top \ S ∪ {X} is derivable.

So T₁ & T₂ <: T₁ \ S ∪ {X} is derivable via ssub_d_andR, using the fact that T₂ = Top -> Top is a subtype of T₁ = X -> Top (with the X -occurrence absorbed by Top contravariantly).

Now T₁ & T₂ <: T₂ \ S ∪ {X} = ... <: Top -> Top \ S ∪ {X}:

ssub_d_andL: X -> Top <: Top -> Top requires (contravariant) Top <: X — fails.
ssub_d_andR: Top -> Top <: Top -> Top ✓.

So T₁ & T₂ <: T₂ \ S ∪ {X} is derivable via ssub_d_andR.

So T₁ & T₂ <: T₁ & T₂ \ S ∪ {X} is derivable! Both decomposed sub-goals work.

Now check the subclaim: T₁ ~ T₁ \ S ∪ {X}?

T₁ = X -> Top. T₁ <: T₁ \ S ∪ {X} via ssub_d_arrow: contravariantly X <: X \ S ∪ {X} — blocked. So T₁ ~ T₁ \ S ∪ {X} fails.

🎯 Counterexample found! With T₁ = X -> Top, T₂ = Top -> Top:

T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} syntactic ✓ (the intersection’s self-equivalence holds).
T₁ ~ T₁ \ S ∪ {X} syntactic ✗ (the component’s self-equivalence fails).

The subclaim is FALSE.
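The counterexample can be checked mechanically. Below is a minimal Python sketch of a decision procedure for A <: B \ S, assuming the rule set as used in this discussion (ssub_d_top, ssub_d_var blocked on S, ssub_d_arrow, ssub_d_and for a right-hand intersection, ssub_d_andL/R for an ordinary right-hand side) and assuming no record or distributivity rules. The tuple encoding and the names sub, arrow, and inter are illustrative choices, not part of your formalization, and S is taken to be empty.

```python
TOP = "Top"                      # any other string is a type variable

def arrow(dom, cod):
    return ("->", dom, cod)

def inter(left, right):
    return ("&", left, right)

def sub(a, b, avoid=frozenset()):
    """Decide a <: b avoiding `avoid` under the assumed rules."""
    if b == TOP:                                       # ssub_d_top
        return True
    if isinstance(b, tuple) and b[0] == "&":           # ssub_d_and: both right branches
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if isinstance(a, tuple) and a[0] == "&":           # ssub_d_andL / ssub_d_andR
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if isinstance(a, tuple) and isinstance(b, tuple):  # ssub_d_arrow
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if isinstance(a, str) and a == b:                  # ssub_d_var, blocked on avoid
        return a not in avoid
    return False

T1 = arrow("X", TOP)             # X -> Top
T2 = arrow(TOP, TOP)             # Top -> Top
S_X = frozenset({"X"})           # S ∪ {X} with S = ∅ (assumption of this sketch)

both = inter(T1, T2)
print(sub(both, both, S_X))      # True: the intersection is self-equivalent
print(sub(T1, T1, S_X))          # False: the component is not
```

The second call fails only because of the blocked variable rule; with an empty avoidance set, sub(T1, T1) succeeds.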

Why this happens

The intersection’s self-equivalence can hold because of cross-branch subsumption: when proving T₁ & T₂ <: T₁ \ S ∪ {X}, we can pick ssub_d_andR and use T₂ <: T₁ if T₂ happens to be a subtype of T₁ in the avoidance system. The active X in T₁ gets absorbed by Top on T₂’s side during the comparison.

But individually, T₁ <: T₁ requires comparing T₁’s X with itself directly, and this is blocked because the variable rule for X is not allowed.

So self-equivalence of intersection ≠ self-equivalence of both branches. The subclaim was wrong.

Implications for Lemma 2’

This breaks the Lemma 2’ proof in:

Case ssub_d_andL for goal (1): I needed A_a ~ A_a \ S ∪ {X} from A_a & A_b ~ A_a & A_b \ S ∪ {X}. Not derivable in general (per the counterexample).

Case ssub_d_and for goal (2): I needed B_a ~ B_a \ S ∪ {X} from B_a & B_b ~ B_a & B_b \ S ∪ {X}. Same issue.

So the proof of Lemma 2’ has a real gap in the intersection cases — the dual formulation handles the contravariant arrow swap nicely, but the intersection inversion doesn’t decompose self-equivalence cleanly.

What this suggests structurally

The “active X absorbed by the other branch” pattern — like T₁ = X -> Top, T₂ = Top -> Top — is a real syntactic pattern that the avoidance system can recognize at the intersection level but not at the individual-branch level.

This means the right notion of “B is X-absorbed” is at the level of the whole type, including its intersection structure, not at the level of components. Decomposition loses information.

Possible responses:

Reformulate Lemma 2’ to avoid needing component self-equivalences. Maybe in the intersection cases, we can use the intersection’s self-equivalence directly, without decomposing.
Strengthen the IH to track when self-equivalence is “partial” — e.g., “either this component has self-equivalence, or another component covers it via subsumption.”
Add a different rule to the avoidance system that captures “X-absorption via intersection coverage” more explicitly.

Let me think about option 1, which seems cleanest.

In the ssub_d_andL case (goal 1): we have A_a <: B \ S (sub-derivation), A_a & A_b ~ A_a & A_b \ S ∪ {X} (premise 1), and we want A_a & A_b <: B \ S ∪ {X}.

Direct route via ssub_d_andL: derive A_a <: B \ S ∪ {X}. By IH (1), this needs A_a ~ A_a \ S ∪ {X} — which we don’t have in general.

Alternative route via ssub_d_andR: derive A_b <: B \ S ∪ {X}. By IH (1), this needs A_b ~ A_b \ S ∪ {X} and A_b <: B \ S (which we don’t directly have either — we only have A_a <: B \ S).

Hmm. Neither direct route works without additional information.

Wait, let me reconsider. We have premise 1 saying A_a & A_b ~ A_a & A_b \ S ∪ {X} syntactic. As established, this might be derivable via cross-branch subsumption — meaning A_b <: A_a \ S ∪ {X} and/or A_a <: A_b \ S ∪ {X} might hold.

If A_b <: A_a \ S ∪ {X}, then by transitivity with A_a <: B \ S ∪ {X} (which we’d derive somehow), we’d get A_b <: B \ S ∪ {X} and could conclude via ssub_d_andR. But this is circular: we’d need A_a <: B \ S ∪ {X} to start.

Cleaner alternative: Maybe the proof’s induction has to be more sophisticated, tracking the derivation of premise 1 alongside premise 2.

This is getting messy. Let me step back.

Honest assessment

You were right to push on the subclaim. The subclaim is false, and this breaks the proof of Lemma 2’ in the intersection cases. Specifically, the “self-equivalence of an intersection implies self-equivalence of components” doesn’t hold in your avoidance system, due to cross-branch subsumption.

The dual formulation of Lemma 2’ worked beautifully for the arrow case but doesn’t handle the intersection case as cleanly. We need a different strategy for intersections.

Some directions to think about:

The right invariant for intersections might involve the whole intersection, not its components. Perhaps a generalized version of “self-equivalence” that’s preserved by structural induction.
Or perhaps the induction should be on something else — not the derivation of premise 2 directly, but on some measure that decomposes intersections more cooperatively.
Or perhaps additional rules in the avoidance system could make decomposition work, by explicitly recognizing the “X is absorbed via the other branch” pattern.

I want to be honest: this is a real obstacle, not just a proof-engineering wrinkle. The cross-branch absorption is a subtle phenomenon that the current syntactic formulation captures at the intersection level but loses on decomposition.

Question back to you: does this kind of cross-branch absorption pattern come up in your other meta-theoretic proofs? If so, how have you handled it elsewhere? And does your paper have other lemmas that decompose intersection self-equivalence — if so, do they impose additional conditions to make it work?

I’d rather understand the existing toolkit before proposing something new.

You

You are right to stop at this point. I should tell you that I’ve come up against a similar situation before, though that was a different scenario. The problem we are facing now is more like the result of my previous attempts at solving this kind of path divergence problem. Just to give you a taste: when I try to track these variable substitutions independently, different variables may, because of the intersection eliminations, give rise to different subtyping derivation trees. So I need to track an explicit set of variables to ensure that they share a single simultaneous derivation tree.

Can you come up with some other strategies that solve the and/andr case in the original lemma proof sketch?

I still think B ~ B avoiding S / X is a good idea and expresses the property we need for the substitution inversion theorem.

Claude

Strategies for the Intersection Case

You’re describing a real and recurring issue: when different variables interact differently with intersection structures, the derivation trees diverge, and the simple inductive structure breaks. Let me think hard about strategies for the ssub_d_and and ssub_d_andL/R cases, keeping B ~ B \ S ∪ {X} as the core hypothesis.

Recap of what’s broken

In the dual formulation Lemma 2’:

ssub_d_and (goal 2, B = B_a & B_b): needed B_a ~ B_a \ S ∪ {X} and B_b ~ B_b \ S ∪ {X}. Not derivable from B_a & B_b ~ B_a & B_b \ S ∪ {X} alone (cross-branch absorption can hold without component self-equivalence).
ssub_d_andL/R (goal 1, A = A_a & A_b): symmetric issue.

Let me explore strategies, ordered roughly by how invasive they are.

Strategy 1: Pass the intersection’s self-equivalence directly to the IH

Idea: Don’t decompose self-equivalence. Instead, apply the IH with the self-equivalence of the whole intersection, and let the IH’s recursion through the derivation handle the structure.

For ssub_d_andL (goal 1): we have A_a & A_b ~ A_a & A_b \ S ∪ {X} and the sub-derivation A_a <: B \ S. We want A_a & A_b <: B \ S ∪ {X}.

What if we apply the IH with the “wrong” hypothesis, namely, give the IH the self-equivalence of the whole A_a & A_b together with the sub-derivation A_a <: B \ S?

But the IH is parameterized by a sub-derivation A' <: B' \ S (where A', B' are the actual LHS/RHS of that sub-derivation). It expects A' ~ A' \ S ∪ {X} (goal 1) or B' ~ B' \ S ∪ {X} (goal 2). We can’t pass the self-equivalence of A_a & A_b as the hypothesis for A_a <: B \ S: they’re about different types.

So this naive idea doesn’t work. But there’s a variant: maybe the IH can be relaxed to accept self-equivalence of a containing type. That’s interesting and worth thinking about, but it changes the lemma’s shape significantly.

Strategy 2: Detour through cross-branch subsumption

Idea: When A_a & A_b ~ A_a & A_b \ S ∪ {X} holds without A_a ~ A_a decomposing cleanly, it must be because A_b <: A_a (or vice versa) “covers” A_a’s active X-occurrences via cross-branch subsumption. Use this fact directly.

Concretely: from A_a & A_b ~ A_a & A_b \ S ∪ {X}, we have A_a & A_b <: A_a \ S ∪ {X}. With A_a ordinary, this is derived by ssub_d_andL or ssub_d_andR:

(a) ssub_d_andL: A_a <: A_a \ S ∪ {X}. This is A_a’s self-equivalence! So we’d be in the easy case.
(b) ssub_d_andR: A_b <: A_a \ S ∪ {X}. This is cross-branch subsumption.

If case (a), we have A_a’s self-equivalence, apply IH (1) directly. ✓

If case (b), we have A_b <: A_a \ S ∪ {X}. Can we use this?

We have the sub-derivation A_a <: B \ S. We could try transitivity (admissible) in \ S ∪ {X} together with a lifting lemma from \ S to \ S ∪ {X} (anything derivable in \ S that never applies the variable rule at X should be derivable in \ S ∪ {X}), but that lifting is not obvious either, and it is essentially what we are trying to prove.

Hmm, let me think again.

We have:

A_b <: A_a \ S ∪ {X} (from case (b)).
A_a <: B \ S (sub-derivation in premise 2).

If we could lift A_a <: B \ S to A_a <: B \ S ∪ {X} somehow, then transitivity gives A_b <: B \ S ∪ {X}, and we use ssub_d_andR to derive A_a & A_b <: B \ S ∪ {X}. ✓

But lifting A_a <: B \ S to A_a <: B \ S ∪ {X} is exactly Lemma 2’ applied to a sub-case! It’s circular.

Unless the recursive call’s input is structurally smaller. Let’s see: in case (b), we need to prove A_a <: B \ S ∪ {X} recursively. The sub-derivation A_a <: B \ S has size ≤ original. We’d need self-equivalence of either A_a or B in \ S ∪ {X}. We have… hmm, do we have either?

B’s status: nothing is assumed for goal (1) (we’re proving goal (1), so we don’t have B’s self-eq).

A_a’s status: from case (b) of premise 1’s decomposition, we got A_b <: A_a \ S ∪ {X}, not A_a <: A_a. So we don’t have A_a’s self-equivalence directly.

Hmm. We have A_b <: A_a \ S ∪ {X}. What does that buy us? Well, if we had A_a <: A_b \ S ∪ {X} symmetrically (the other inversion direction), we’d have A_a ~ A_b \ S ∪ {X} mutually. Transitivity would then give A_a <: A_a \ S ∪ {X} from A_a <: A_b <: A_a. But that only works if we actually have A_a <: A_b.

Inverting the other direction: A_a & A_b <: A_b \ S ∪ {X} (from premise 1). With A_b ordinary:

ssub_d_andL: A_a <: A_b \ S ∪ {X}.
ssub_d_andR: A_b <: A_b \ S ∪ {X}, i.e., A_b’s self-equivalence.

If we’re in case (b) for the first decomposition (A_b <: A_a), we don’t know which case we’re in for the second decomposition. Both cases are possible.

Let me try to enumerate:

Subcase (b)-(a): A_b <: A_a and A_b <: A_b. We have A_b’s self-equivalence.
Subcase (b)-(b): A_b <: A_a and A_a <: A_b. We have A_a ~ A_b \ S ∪ {X} mutually. By transitivity, A_a <: A_a \ S ∪ {X} from A_a <: A_b <: A_a? That needs A_b <: A_a, which we have, and A_a <: A_b, which we have. So by transitivity: A_a <: A_a \ S ∪ {X}. ✓ A_a’s self-eq follows.

Let me verify subcase (b)-(b) more carefully. We have A_a <: A_b \ S ∪ {X} and A_b <: A_a \ S ∪ {X}. By transitivity (admissibility): A_a <: A_a \ S ∪ {X}. ✓

So in subcase (b)-(b), we get A_a’s self-equivalence via transitivity! We can apply IH (1) directly.

In subcase (b)-(a), we have A_b <: A_a \ S ∪ {X} and A_b <: A_b \ S ∪ {X}. We would like to use IH (1) on A_a <: B \ S, but that needs A_a’s self-equivalence, which we don’t have. So we’re stuck.

But wait: A_b <: A_a and A_b <: A_b together don’t easily give us A_a <: A_a. Unless we can also derive A_a <: A_b somehow, putting us in subcase (b)-(b).

Let me look at this differently. What if we directly use A_b’s self-equivalence and the cross-branch fact A_b <: A_a?

Goal: A_a & A_b <: B \ S ∪ {X}. Decompose via ssub_d_andR to need A_b <: B \ S ∪ {X}.

We have:

A_b <: A_a \ S ∪ {X} (cross-branch).
A_a <: B \ S (sub-derivation).
A_b <: A_b \ S ∪ {X} (subcase (b)-(a) gives A_b’s self-eq).

Apply IH (1) to A_a <: B \ S with… we’d need A_a’s self-eq, which we don’t have.

Apply IH (2) to A_a <: B \ S with… we’d need B’s self-eq, which we don’t have either (proving goal 1).

Stuck. Subcase (b)-(a) doesn’t close. But maybe it’s vacuous — i.e., it doesn’t actually arise.

Is subcase (b)-(a) vacuous?

Subcase (b)-(a): A_b <: A_a \ S ∪ {X} (cross-branch) and A_b <: A_b \ S ∪ {X} (A_b’s self-eq), with A_a <: A_b \ S ∪ {X} not required.

Concretely: A_a = X -> Top, A_b = Top -> Top.

A_b <: A_a \ S ∪ {X}: Top -> Top <: X -> Top. Contravariantly X <: Top ✓ (ssub_d_top). Covariantly Top <: Top ✓. ✓
A_b <: A_b \ S ∪ {X}: Top -> Top <: Top -> Top. ✓
A_a <: A_b \ S ∪ {X}: X -> Top <: Top -> Top. Contravariantly Top <: X \ S ∪ {X} fails (no rule derives Top <: X, with or without the avoidance set).

So subcase (b)-(a) genuinely arises with this example. Self-equivalence of A_a & A_b = (X -> Top) & (Top -> Top) in \ S ∪ {X}:

... <: X -> Top \ S ∪ {X}: pick ssub_d_andR with Top -> Top <: X -> Top \ S ∪ {X} ✓.
... <: Top -> Top \ S ∪ {X}: pick ssub_d_andR with Top -> Top <: Top -> Top ✓.

Yes, this is the canonical “X is absorbed through the other branch” pattern. Subcase (b)-(a) is the genuine difficulty.

Stepping back: what’s special about subcase (b)-(a)?

In this case:

A_a has active X (A_a = X -> Top, with X contravariant requiring reflexivity which is blocked).
A_b is fully X-absorbed (A_b = Top -> Top, no X at all).
Cross-branch: A_b <: A_a \ S ∪ {X} (because A_b is “smaller” than A_a, with Top absorbing A_a’s X contravariantly).
The intersection’s self-eq holds because A_b covers both the comparison to A_a and to itself.

Goal of Lemma 2’ (1): given A_a <: B \ S (sub-derivation) and A_a & A_b ~ A_a & A_b \ S ∪ {X}, prove A_a & A_b <: B \ S ∪ {X}.

The key insight: in this configuration, A_b <: B \ S ∪ {X} should also hold, because A_b <: A_a and A_a <: B, so by transitivity (lifted to \ S ∪ {X})… but this needs Lemma 2’ applied to A_a <: B, which is what we’re trying to prove. Circular.

However, we can try the chain differently:

A_b <: A_a \ S ∪ {X} ✓ (have).
A_a <: B \ S (sub-derivation). Apply Lemma 2’ with A_a’s self-eq… wait, we don’t have it.

Hmm. Let me think if A_b <: B \ S ∪ {X} is derivable directly without going through A_a.

Concrete example: A_a = X -> Top, A_b = Top -> Top, sub-derivation A_a <: B \ S for some B. With A_a = X -> Top ordinary and B ordinary: A_a <: B requires B to be an arrow B_dom -> B_cod (or Top). If B = Top: trivial, A_a & A_b <: Top \ S ∪ {X} directly.

Suppose B = B_dom -> B_cod. Then A_a <: B \ S = X -> Top <: B_dom -> B_cod \ S requires B_dom <: X \ S (contravariant) and Top <: B_cod \ S (covariant). The first (since X ∉ S) can be derived by ssub_d_var if B_dom = X, or via intersection elimination if B_dom is an intersection with X as a branch. ssub_d_top does not apply, because the right side is the variable X, not Top. So the only way is B_dom = X or B_dom an intersection that exposes X.

Take B_dom = X (specifically). Then A_a <: B \ S = X -> Top <: X -> Top \ S is derivable in \ S (since X ∉ S). So this is a valid sub-derivation.

Goal: A_a & A_b = (X -> Top) & (Top -> Top) <: X -> Top \ S ∪ {X}. As shown before, this is derivable via ssub_d_andR using A_b <: A_a \ S ∪ {X}. ✓

So actually, in this specific instance, the goal is derivable. The proof doesn’t need Lemma 2’ at all in this case — the goal can be derived directly from A_b <: A_a \ S ∪ {X} (which we extracted from premise 1) via ssub_d_andR with A_b <: B \ S ∪ {X} derived via… hmm wait, we picked ssub_d_andR with A_b <: A_a to handle the goal ... <: A_a. But the goal here is ... <: B, not ... <: A_a.

Let me redo. Goal: A_a & A_b <: B \ S ∪ {X} where B = X -> Top (in our concrete example).

So B = X -> Top = A_a. The goal is A_a & A_b <: A_a \ S ∪ {X} — which is half of A_a & A_b ~ A_a & A_b \ S ∪ {X} (premise 1). So directly from premise 1! ✓

OK in this contrived example, the goal is premise 1. Not informative.

Let me try a different B. B = Top -> Top. Then A_a <: B \ S = X -> Top <: Top -> Top \ S requires Top <: X \ S — fails. So A_a <: B \ S doesn’t hold, can’t be a sub-derivation. Try harder.

B = (X & Top) -> Top (or something with X). A_a <: B \ S = X -> Top <: (X & Top) -> Top \ S requires X & Top <: X \ S (contravariant). X & Top <: X via ssub_d_andL: X <: X \ S ✓ (X ∉ S). So A_a <: B \ S ✓.

Goal: A_a & A_b <: B \ S ∪ {X} = (X -> Top) & (Top -> Top) <: (X & Top) -> Top \ S ∪ {X}. With B = (X & Top) -> Top ordinary (it’s an arrow).

Pick ssub_d_andL: X -> Top <: (X & Top) -> Top \ S ∪ {X} requires X & Top <: X \ S ∪ {X} (contravariant). X & Top <: X: via ssub_d_andL, need X <: X \ S ∪ {X} — blocked. Via ssub_d_andR, need Top <: X \ S ∪ {X} — fails.

Pick ssub_d_andR: Top -> Top <: (X & Top) -> Top \ S ∪ {X} requires X & Top <: Top \ S ∪ {X} (contravariant). ✓ via ssub_d_top. Covariant: Top <: Top ✓. ✓

So the goal is derivable via ssub_d_andR, even though the direct strategy for Lemma 2’ (1) via ssub_d_andL (using A_a <: B \ S ∪ {X} from the IH) is blocked.

🎯 Key insight: In the intersection case, the correct derivation of the goal might use a different branch than the sub-derivation in premise 2! The sub-derivation says A_a <: B \ S, but the goal A_a & A_b <: B \ S ∪ {X} might need ssub_d_andR (using A_b <: B \ S ∪ {X}) instead.

This is the structural mismatch: when adding X to the avoidance set, the “right” intersection branch for the derivation may shift.
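The branch shift can be observed directly. The following Python sketch assumes the same rule set as in this discussion (no records, no distributivity) and takes S = ∅; the encoding and the helper usable_branches are hypothetical names introduced only for illustration. It lists which left-intersection rule can conclude A_a & A_b <: B before and after X is added to the avoidance set.

```python
TOP = "Top"                      # any other string is a type variable

def arrow(dom, cod):
    return ("->", dom, cod)

def inter(left, right):
    return ("&", left, right)

def sub(a, b, avoid=frozenset()):
    """Decide a <: b avoiding `avoid` under the assumed rules."""
    if b == TOP:                                       # ssub_d_top
        return True
    if isinstance(b, tuple) and b[0] == "&":           # ssub_d_and
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if isinstance(a, tuple) and a[0] == "&":           # ssub_d_andL / ssub_d_andR
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if isinstance(a, tuple) and isinstance(b, tuple):  # ssub_d_arrow
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if isinstance(a, str) and a == b:                  # ssub_d_var, blocked on avoid
        return a not in avoid
    return False

def usable_branches(a, b, avoid):
    """Left-intersection rules that can conclude a <: b, for a = left & right."""
    candidates = (("ssub_d_andL", a[1]), ("ssub_d_andR", a[2]))
    return [rule for rule, branch in candidates if sub(branch, b, avoid)]

A_a, A_b = arrow("X", TOP), arrow(TOP, TOP)
B = arrow(inter("X", TOP), TOP)                  # (X & Top) -> Top
S, S_X = frozenset(), frozenset({"X"})           # S = ∅ is an assumption

print(usable_branches(inter(A_a, A_b), B, S))    # ['ssub_d_andL', 'ssub_d_andR']
print(usable_branches(inter(A_a, A_b), B, S_X))  # ['ssub_d_andR']
```

Under \ S both rules are available, so a derivation of premise 2 may happen to use ssub_d_andL; under \ S ∪ {X} only ssub_d_andR survives.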

Strategy 3: Allow strategy-shift in the intersection case

New strategy: In the ssub_d_andL case for goal (1), instead of trying to derive A_a <: B \ S ∪ {X} (which may not hold), try to derive A_a & A_b <: B \ S ∪ {X} directly using whichever strategy works, based on the structure of premise 1.

Specifically: from A_a & A_b ~ A_a & A_b \ S ∪ {X}, we have A_a & A_b <: A_a \ S ∪ {X} and A_a & A_b <: A_b \ S ∪ {X}. Combined with the sub-derivation A_a <: B \ S:

If we can derive A_a <: B \ S ∪ {X} (e.g., via IH if A_a has self-eq) OR A_b <: B \ S ∪ {X} (somehow), then ssub_d_andL or ssub_d_andR closes the goal.

In the canonical example (A_a = X -> Top, A_b = Top -> Top, B = (X & Top) -> Top): A_a <: B \ S ∪ {X} is blocked, but A_b <: B \ S ∪ {X} is derivable directly because A_b = Top -> Top is X-free. Specifically, since A_b has no free X, A_b <: B \ S lifts trivially to A_b <: B \ S ∪ {X} (every use of the variable rule matches a variable occurrence in A_b against one in B, so with A_b X-free the rule is never applied at X).

But wait — we need a sub-derivation A_b <: B \ S to start. We don’t have that directly; we have A_a <: B \ S.

Can we derive A_b <: B \ S from what we have? We have A_b <: A_a \ S ∪ {X} (from premise 1). Restricting to \ S: A_b <: A_a \ S (since \ S ∪ {X} is more restrictive than \ S, so derivable in the former implies derivable in the latter — assuming this monotonicity holds). And A_a <: B \ S (sub-derivation). By transitivity (admissibility): A_b <: B \ S. ✓

Now lift A_b <: B \ S to A_b <: B \ S ∪ {X}: this is exactly Lemma 2’ (1) with A_b ~ A_b \ S ∪ {X} as the hypothesis. We have A_b ~ A_b \ S ∪ {X} from subcase (b)-(a) of premise 1’s inversion!

So: apply Lemma 2’ (1) recursively to A_b <: B \ S with hypothesis A_b ~ A_b \ S ∪ {X}. Is the recursive call structurally smaller?

Hmm. The sub-derivation we constructed (A_b <: B \ S) is via transitivity of A_b <: A_a \ S and A_a <: B \ S. Its size is roughly the sum of these two derivations’ sizes — not necessarily smaller than the original.

This is a problem for termination of the recursion.

Revised strategy: need a stronger induction measure

The measure on the proof needs to handle this. Maybe induction on the structure of types (A + B ‘s size), not on the derivation, so that recursive calls on smaller types terminate even if the derivation grows.

In our case: A_b <: B \ S ∪ {X} is a smaller-type goal? |A_b| + |B| < |A_a & A_b| + |B| — yes, since |A_a & A_b| = |A_a| + |A_b| + 1. ✓

So if Lemma 2’ is proven by induction on |A| + |B| (type sizes), the recursive call on A_b <: B \ S ∪ {X} is on strictly smaller types and the induction terminates.
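To make the measure concrete, here is a small Python sketch of |A| + |B|, assuming that Top and variables have size 1 and that each arrow or intersection adds 1 to the sizes of its parts (matching |A_a & A_b| = |A_a| + |A_b| + 1). The tuple encoding and the names size and measure are illustrative only.

```python
TOP = "Top"                      # any other string is a type variable

def size(t):
    """Constructor count: atoms are 1, binary constructors add 1."""
    if isinstance(t, tuple):     # ("->", dom, cod) or ("&", left, right)
        return 1 + size(t[1]) + size(t[2])
    return 1

def measure(a, b):
    """The induction measure |A| + |B|."""
    return size(a) + size(b)

A_a = ("->", "X", TOP)                   # X -> Top
A_b = ("->", TOP, TOP)                   # Top -> Top
B = ("->", ("&", "X", TOP), TOP)         # (X & Top) -> Top

# The recursive call on A_b <: B is on strictly smaller types than A_a & A_b <: B.
print(measure(A_b, B), measure(("&", A_a, A_b), B))   # 8 12
```

The measure ignores how large the derivation of A_b <: B \ S is, which is the point: a derivation built by transitivity may grow, but the types do not.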

This is structurally cleaner. Let me re-examine whether the whole proof goes through with this measure.

Proof outline with type-size induction

Induct on |A| + |B|. Cases by the last rule of A <: B \ S (premise 2).

In the ssub_d_andL case for goal (1): A = A_a & A_b, sub-derivation A_a <: B \ S. Premise 1: A_a & A_b ~ A_a & A_b \ S ∪ {X}.

Decompose premise 1:

A_a & A_b <: A_a \ S ∪ {X}: case (a) A_a <: A_a \ S ∪ {X} or case (b) A_b <: A_a \ S ∪ {X}.
A_a & A_b <: A_b \ S ∪ {X}: case (a’) A_a <: A_b \ S ∪ {X} or case (b’) A_b <: A_b \ S ∪ {X}.

Subcase (a): A_a ~ A_a \ S ∪ {X} directly (case (a) of first decomposition gives A_a <: A_a; we’d also need the reverse, but for self-equivalence we usually need both directions — hmm, let me re-examine).

Actually for self-equivalence A_a ~ A_a \ S ∪ {X} we need A_a <: A_a in both directions, but mutual subtyping with the same type just needs one direction (it’s symmetric trivially). Wait no, A_a ~ A_a means A_a <: A_a and A_a <: A_a — same condition twice. So A_a ~ A_a is just A_a <: A_a.

OK so subcase (a): A_a <: A_a \ S ∪ {X}, i.e., A_a’s self-equivalence. Apply IH on sub-derivation A_a <: B \ S (size: |A_a| + |B| < |A_a & A_b| + |B|, so smaller in our type-size measure). IH gives A_a <: B \ S ∪ {X}. Apply ssub_d_andL to conclude A_a & A_b <: B \ S ∪ {X}. ✓

Subcase (b): A_b <: A_a \ S ∪ {X}, with A_a’s self-eq not directly derivable.

By the second decomposition of premise 1: case (a’) A_a <: A_b \ S ∪ {X} or case (b’) A_b <: A_b \ S ∪ {X}.

Subcase (b)-(a’): Both A_b <: A_a and A_a <: A_b. By transitivity: A_a <: A_a \ S ∪ {X} (composing A_a <: A_b <: A_a). So A_a’s self-eq is derivable. Same as subcase (a). ✓

Subcase (b)-(b’): A_b <: A_a and A_b <: A_b. Have A_b’s self-eq, no A_a self-eq. (This is the subcase labeled (b)-(a) in Strategy 2.)

Construct A_b <: B \ S: from A_b <: A_a \ S ∪ {X} (which implies A_b <: A_a \ S by avoidance-set-shrinking — assuming this monotonicity lemma holds) and A_a <: B \ S (sub-derivation), by transitivity A_b <: B \ S. Size of A_b <: B: |A_b| + |B| < |A_a & A_b| + |B|. ✓

Apply IH (1) to A_b <: B \ S with hypothesis A_b ~ A_b \ S ∪ {X} (subcase (b’)). Get A_b <: B \ S ∪ {X}. Apply ssub_d_andR to conclude A_a & A_b <: B \ S ∪ {X}. ✓

So all subcases close, provided we have:

Type-size induction (rather than derivation-size).
Avoidance-set monotonicity: A <: B \ S ∪ {X} implies A <: B \ S (assuming \ S ∪ {X} is more restrictive, this should be admissible).
Transitivity of \ S.

This is a real proof strategy that avoids the broken subclaim. It uses the cross-branch subsumption structurally rather than trying to decompose self-equivalence.
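The case analysis above can be summarized as a choice of concluding rule. The Python sketch below assumes the same rule set as in this discussion (no records, no distributivity), ordinary A_a and A_b, and S = ∅; mirrored_rule is a hypothetical helper name. It returns the rule the proof would conclude with: ssub_d_andL when A_a is self-equivalent (subcases (a) and (b)-(a’)), ssub_d_andR otherwise (subcase (b)-(b’)), and None when premise 1 fails.

```python
TOP = "Top"                      # any other string is a type variable

def arrow(dom, cod):
    return ("->", dom, cod)

def inter(left, right):
    return ("&", left, right)

def sub(a, b, avoid=frozenset()):
    """Decide a <: b avoiding `avoid` under the assumed rules."""
    if b == TOP:                                       # ssub_d_top
        return True
    if isinstance(b, tuple) and b[0] == "&":           # ssub_d_and
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if isinstance(a, tuple) and a[0] == "&":           # ssub_d_andL / ssub_d_andR
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if isinstance(a, tuple) and isinstance(b, tuple):  # ssub_d_arrow
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if isinstance(a, str) and a == b:                  # ssub_d_var, blocked on avoid
        return a not in avoid
    return False

def mirrored_rule(a_a, a_b, avoid):
    """Rule used to conclude a_a & a_b <: B, read off premise 1 (ordinary branches)."""
    whole = inter(a_a, a_b)
    if not sub(whole, whole, avoid):
        return None                      # premise 1 does not hold
    if sub(a_a, a_a, avoid):
        return "ssub_d_andL"             # subcases (a) and (b)-(a')
    return "ssub_d_andR"                 # subcase (b)-(b'): a_b covers both

S_X = frozenset({"X"})
print(mirrored_rule(arrow("X", TOP), arrow(TOP, TOP), S_X))   # ssub_d_andR
print(mirrored_rule(arrow(TOP, TOP), arrow("X", TOP), S_X))   # ssub_d_andL
print(mirrored_rule(arrow("X", TOP), arrow(TOP, "X"), S_X))   # None
```

This only mirrors the branch choice; it does not replace the IH call that actually produces the sub-derivation in \ S ∪ {X}.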

What about goal (2) in the ssub_d_and case?

B = B_a & B_b, sub-derivations A <: B_a \ S and A <: B_b \ S. Premise 1: B_a & B_b ~ B_a & B_b \ S ∪ {X}.

Want: A <: B_a & B_b \ S ∪ {X} via ssub_d_and from A <: B_a \ S ∪ {X} and A <: B_b \ S ∪ {X}.

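For A <: B_a \ S ∪ {X}: apply IH (2) to A <: B_a \ S with B_a ~ B_a \ S ∪ {X}. Need B_a’s self-eq.

From premise 1: B_a & B_b <: B_a \ S ∪ {X}. By the analogous case analysis on this and on B_a & B_b <: B_b \ S ∪ {X}, we get either B_a’s self-eq or a cross-branch chain that lets us proceed via B_b instead.

This is nearly symmetric to goal (1) for ssub_d_andL, with one difference: ssub_d_and needs both branches, so we cannot simply switch branches in the conclusion. Instead, when only B_b has self-equivalence and B_b <: B_a \ S ∪ {X} holds cross-branch, IH (2) on A <: B_b \ S gives A <: B_b \ S ∪ {X}, and transitivity in \ S ∪ {X} should then give A <: B_a \ S ∪ {X}. So the strategy is the same: case-analyze on which branch of premise 1 wins, use the cross-branch chain when needed, use type-size induction.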

Summary

I believe the proof goes through with this strategy:

Induct on |A| + |B| (types’ total size), not derivation size.
In intersection cases, case-analyze on premise 1’s decomposition to determine which branch has self-equivalence.
When the branch used in premise 2 lacks self-equivalence but the other branch has it and cross-branch subsumption holds, construct a derivation on smaller types via transitivity and apply the IH to it rather than to the sub-derivation in premise 2.
Use ssub_d_andL or ssub_d_andR adaptively based on which branch the IH succeeds for.

The dual formulation of Lemma 2’ (goals 1 and 2 simultaneously) handles arrow contravariance.

Type-size induction allows recursion on derivations constructed via transitivity (which might be larger than the original) as long as the types involved are smaller.

Caveats

Auxiliary lemmas needed:
- Avoidance-set monotonicity (<: \ S ∪ {X} implies <: \ S).
- Transitivity of <: \ S (admissibility).
These are standard, but need verification in your formalization.
The case analysis on premise 1’s decomposition assumes the avoidance system has good inversion properties — i.e., that the only ways to derive A_a & A_b <: A_a \ S ∪ {X} (with A_a ordinary) are ssub_d_andL or ssub_d_andR. This should hold by your rules.
What if A_a is itself an intersection (not ordinary)? The rules ssub_d_andL/R require the RHS to be ordinary. So if A_a is an intersection, A_a & A_b <: A_a \ S ∪ {X} decomposes via ssub_d_and (right intersection introduction). I haven’t worked out this nested case carefully.
The whole approach hinges on being able to construct a smaller derivation via transitivity. If the types involved in the cross-branch fact (A_b <: A_a) are not smaller than the original goal, the induction might not terminate. With type-size measure |A_a & A_b| + |B| and recursing on |A_b| + |B|, this is fine. But if instead we recurse on |A_a| + |A_b| somehow, things could break. Worth checking carefully.
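Before investing in the formal proofs, the two auxiliary lemmas can be sanity-checked by brute force on small types. The Python sketch below assumes the same rule set as in this discussion (no records, no distributivity), a single variable X, S = ∅, and nesting depth at most 2; types_up_to, relation, is_transitive, and is_monotone are hypothetical helper names. A bounded check like this can only expose counterexamples; it is no substitute for the proofs.

```python
from itertools import product

TOP = "Top"                      # any other string is a type variable

def sub(a, b, avoid=frozenset()):
    """Decide a <: b avoiding `avoid` under the assumed rules."""
    if b == TOP:                                       # ssub_d_top
        return True
    if isinstance(b, tuple) and b[0] == "&":           # ssub_d_and
        return sub(a, b[1], avoid) and sub(a, b[2], avoid)
    if isinstance(a, tuple) and a[0] == "&":           # ssub_d_andL / ssub_d_andR
        return sub(a[1], b, avoid) or sub(a[2], b, avoid)
    if isinstance(a, tuple) and isinstance(b, tuple):  # ssub_d_arrow
        return sub(b[1], a[1], avoid) and sub(a[2], b[2], avoid)
    if isinstance(a, str) and a == b:                  # ssub_d_var, blocked on avoid
        return a not in avoid
    return False

def types_up_to(depth, atoms=(TOP, "X")):
    """All types over `atoms` with nesting depth at most `depth`."""
    level = list(atoms)
    for _ in range(depth):
        level = list(atoms) + [(op, l, r) for op in ("->", "&")
                               for l, r in product(level, repeat=2)]
    return level

def relation(universe, avoid):
    """Map each type to the set of its supertypes within `universe`."""
    return {a: {b for b in universe if sub(a, b, avoid)} for a in universe}

def is_transitive(rel):
    return all(c in rel[a] for a in rel for b in rel[a] for c in rel[b])

def is_monotone(rel_small, rel_big):
    """Every fact under the larger avoidance set also holds under the smaller one."""
    return all(rel_big[a] <= rel_small[a] for a in rel_big)

universe = types_up_to(2)                          # 202 types
rel_S = relation(universe, frozenset())            # \ S with S = ∅
rel_SX = relation(universe, frozenset({"X"}))      # \ S ∪ {X}
print(is_monotone(rel_S, rel_SX), is_transitive(rel_S), is_transitive(rel_SX))
```

Raising the depth or adding a second variable makes the universe grow quickly, so the bound is a trade-off between coverage and running time.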

I think this is a workable strategy. Want me to flesh out the goal (2) case symmetrically, or work through the nested intersection case (caveat 3)?

You

So the subclaim would hold? By applying the proof you inlined in the case analysis? Provided that the transitivity modulo S holds

Claude

Re-examining the Subclaim

Let me check this carefully — you’re right to ask. The subclaim was:

Subclaim: T₁ & T₂ ~ T₁ & T₂ \ S ∪ {X} implies T₁ ~ T₁ \ S ∪ {X} and T₂ ~ T₂ \ S ∪ {X}.

And I gave the counterexample T₁ = X -> Top, T₂ = Top -> Top: the intersection’s self-equivalence holds (via cross-branch subsumption), but T₁ ~ T₁ \ S ∪ {X} fails (because reflexivity on X is blocked).

So the subclaim is genuinely false — transitivity doesn’t rescue it.

What the case analysis actually gave us

Looking back at what I did in the intersection case of Lemma 2’, I went through subcases of how T₁ & T₂ <: T₁ \ S ∪ {X} and T₁ & T₂ <: T₂ \ S ∪ {X} get derived:

Subcase (a) for first: T₁ <: T₁ \ S ∪ {X} directly, i.e., T₁’s self-eq.
Subcase (b) for first: T₂ <: T₁ \ S ∪ {X} — cross-branch.

And similarly for the second.

In subcase (b)-(a’) (cross-branch both ways), transitivity gives T₁ <: T₁, so T₁’s self-eq holds.

In subcase (b)-(b’) (T₂ <: T₁ cross-branch and T₂ <: T₂ self), we have T₂’s self-eq but not T₁’s self-eq.

The counterexample I gave (T₁ = X -> Top, T₂ = Top -> Top) sits in subcase (b)-(b’): T₂ <: T₁ (cross-branch) and T₂ <: T₂ (self), but no T₁ <: T₁ and no T₁ <: T₂.

So the subclaim fails in subcase (b)-(b’), and transitivity doesn’t help here because we don’t have a chain through T₁.

What did go through in the Lemma 2’ proof

What worked in the Lemma 2’ proof was not the subclaim, but a different observation: in subcase (b)-(b’), we don’t need T₁’s self-eq. Instead, we use T₂’s self-eq (which we have) and construct a derivation T₂ <: B \ S via transitivity (possibly a larger derivation, but on smaller types), then apply the IH to that goal and conclude with ssub_d_andR instead of ssub_d_andL.

So the Lemma 2’ proof avoids the subclaim entirely by switching which branch is used in the conclusion’s derivation. The cross-branch structure in premise 1 is matched by an analogous cross-branch structure in the conclusion’s derivation.

To answer your direct question

Q: Does the subclaim hold given transitivity modulo S?

No. Transitivity helps in subcase (b)-(a’) but not in subcase (b)-(b’). The counterexample T₁ = X -> Top, T₂ = Top -> Top lives in subcase (b)-(b’) and genuinely violates the subclaim regardless of transitivity.

What transitivity does give you is enough structure to make Lemma 2’ itself go through, by routing the proof through whichever branch has self-equivalence and using transitivity to construct the appropriate sub-derivation.

So the situation is:

Subclaim: false.
Lemma 2’ (with the strategy I sketched): holds, despite the subclaim being false, because we can adaptively choose which intersection rule to use in the conclusion based on premise 1’s decomposition.

A cleaner way to think about it

The subclaim was an attempt to decompose the intersection’s self-equivalence into component self-equivalences. That decomposition fails because cross-branch subsumption is real.

The successful strategy doesn’t decompose. Instead, it mirrors premise 1’s branch structure into the conclusion’s branch structure:

If premise 1 is “self-equivalence holds via the left branch covering itself,” we conclude via ssub_d_andL using the left branch.
If premise 1 is “self-equivalence holds via the right branch covering both,” we conclude via ssub_d_andR using the right branch (possibly with a transitivity step to get the right sub-derivation).

This mirroring is the right structural intuition, and it’s what makes the proof go through without needing the subclaim to hold.

So you should drop the subclaim and instead think of the intersection case as “case-analyze premise 1, mirror its branching into the conclusion.”

Caveats

I want to flag that I’ve sketched this strategy informally and haven’t rigorously verified all subcases — especially the nested intersection case (where T₁ itself is an intersection) and the goal (2) symmetric case. Before committing fully:

Rigorously enumerate the cases of premise 1’s decomposition, including when T₁ or T₂ is itself non-ordinary (intersection).
Verify the type-size induction measure works for the recursive calls, especially when transitivity is used to construct sub-derivations. The crucial property is that the types in the recursive call’s goal are strictly smaller than the original.
Confirm transitivity (modulo S) and avoidance-set monotonicity are admissible in your specific rule set. I’ve assumed these throughout; if they fail or have subtle conditions, the strategy might need adjustment.
The subclaim’s failure is a structural feature, not a bug — the cross-branch absorption pattern is real and meaningful. Any proof strategy must respect it, which is why the mirroring approach is the right one.

Want me to work through the nested intersection case carefully, or rigorously write out the goal (2) symmetric case?

You

I am not much worried about points 3 and 4, and I think for point 2 you can just assume a standard way of calculating the size of types, where each constructor adds one to the sizes of its inner components. But I do need you to carefully check point 1: please state the theorem clearly and then work out each case carefully, based on the proof idea we have discussed.

Claude

Lemma 2’: Detailed Case Analysis

Let me state the lemma carefully and work through each case.

Statement

Lemma 2’. Suppose A <: B \ S syntactic, and X ∉ S. Then: (1) If A ~ A \ S ∪ {X} syntactic, then A <: B \ S ∪ {X} syntactic. (2) If B ~ B \ S ∪ {X} syntactic, then A <: B \ S ∪ {X} syntactic.

Proof: By strong induction on |A| + |B| where |·| is the standard structural size.

Auxiliary tools assumed admissible:

Transitivity (modulo S): A <: B \ S and B <: C \ S imply A <: C \ S.
Avoidance-set monotonicity: A <: B \ S ∪ {X} implies A <: B \ S (when X ∉ S, so S ⊆ S ∪ {X}).

We prove (1) and (2) simultaneously, case-analyzing on the last rule of A <: B \ S.

Case ssub_d_nat

A = B = nat. Conclusion: nat <: nat \ S ∪ {X} by ssub_d_nat. ✓ both goals.

Case ssub_d_top

B = Top. Conclusion: A <: Top \ S ∪ {X} by ssub_d_top (need lc_typ A ✓). ✓ both goals.

Case ssub_d_var

A = B = typ_fvar Y, Y ∉ S. Want Y <: Y \ S ∪ {X}, requiring Y ∉ S ∪ {X}, i.e., Y ≠ X.

Goal (1): Premise Y ~ Y \ S ∪ {X} derived by ssub_d_var requires Y ∉ S ∪ {X}, so Y ≠ X. ✓

Goal (2): Symmetric. ✓

Case ssub_d_arrow

A = A_dom -> A_cod, B = B_dom -> B_cod. Sub-derivations:

(sub-contra) B_dom <: A_dom \ S
(sub-cov) A_cod <: B_cod \ S

Want A <: B \ S ∪ {X} via ssub_d_arrow from B_dom <: A_dom \ S ∪ {X} and A_cod <: B_cod \ S ∪ {X}.

Goal (1): Premise (A_dom -> A_cod) ~ (A_dom -> A_cod) \ S ∪ {X}. Inversion via ssub_d_arrow: gives A_dom <: A_dom \ S ∪ {X} (contravariant — but for self-equivalence of the same type, this is just self-eq) and A_cod <: A_cod \ S ∪ {X}.

(sub-contra) B_dom <: A_dom \ S: sizes |B_dom| + |A_dom| < |A| + |B| ✓. RHS is A_dom, has self-eq. Apply IH (2): gets B_dom <: A_dom \ S ∪ {X}. ✓
(sub-cov) A_cod <: B_cod \ S: sizes |A_cod| + |B_cod| < |A| + |B| ✓. LHS is A_cod, has self-eq. Apply IH (1): gets A_cod <: B_cod \ S ∪ {X}. ✓
Combine via ssub_d_arrow. ✓

Goal (2): Premise (B_dom -> B_cod) ~ (B_dom -> B_cod) \ S ∪ {X}. Inversion gives B_dom’s and B_cod’s self-eqs.

(sub-contra) B_dom <: A_dom \ S: LHS is B_dom, has self-eq. Apply IH (1). ✓
(sub-cov) A_cod <: B_cod \ S: RHS is B_cod, has self-eq. Apply IH (2). ✓
Combine. ✓

Case ssub_d_rcd

A = typ_rcd l A', B = typ_rcd l B'. Sub-derivation A' <: B' \ S.

Want typ_rcd l A' <: typ_rcd l B' \ S ∪ {X} via ssub_d_rcd from A' <: B' \ S ∪ {X}.

Goal (1): Premise typ_rcd l A' ~ typ_rcd l A' \ S ∪ {X}. Inversion via ssub_d_rcd: gives A' <: A' \ S ∪ {X} (self-eq of A'). Apply IH (1) to A' <: B' \ S with the self-eq of A': gets A' <: B' \ S ∪ {X}. Apply ssub_d_rcd. ✓

Goal (2): Symmetric, with the self-eq of B'. ✓

Case ssub_d_and (intersection introduction on right)

B = B_a & B_b, sub-derivations:

(sub-a) A <: B_a \ S
(sub-b) A <: B_b \ S

Want A <: B_a & B_b \ S ∪ {X} via ssub_d_and from A <: B_a \ S ∪ {X} and A <: B_b \ S ∪ {X}.

Goal (1): Premise A ~ A \ S ∪ {X}. We have A’s self-eq, unchanged.

IH (1) on (sub-a): A’s self-eq ✓, smaller types ✓. Get A <: B_a \ S ∪ {X}.
IH (1) on (sub-b): similarly. Get A <: B_b \ S ∪ {X}.
Combine via ssub_d_and. ✓

Goal (2): Premise B_a & B_b ~ B_a & B_b \ S ∪ {X}. We need A <: B_a \ S ∪ {X} and A <: B_b \ S ∪ {X}. We do not in general have B_a’s or B_b’s self-eq; that was the broken subclaim.

Strategy: case-analyze on the derivation of premise 1’s decompositions.

From premise 1, both directions of <: hold:

B_a & B_b <: B_a & B_b \ S ∪ {X}: this is the <: direction of the equivalence (and ~ here is just <: since it’s the same type on both sides).

Actually wait — ~ is mutual <:, but for self-equivalence T ~ T, it’s the same as T <: T. So premise 1 is just B_a & B_b <: B_a & B_b \ S ∪ {X}.

Inversion of B_a & B_b <: B_a & B_b \ S ∪ {X}: the RHS is an intersection, so ssub_d_and applies, giving:

B_a & B_b <: B_a \ S ∪ {X}… call this (P1a)
B_a & B_b <: B_b \ S ∪ {X}… call this (P1b)

Sub-goal: derive A <: B_a \ S ∪ {X}.

Inversion of (P1a) [B_a & B_b <: B_a \ S ∪ {X}] depends on whether B_a is ordinary.

Subcase B_a ordinary: the rule must be ssub_d_andL or ssub_d_andR.

(P1a)-via-andL: B_a <: B_a \ S ∪ {X}, which is B_a’s self-eq. Apply IH (2) to (sub-a) with B_a’s self-eq: get A <: B_a \ S ∪ {X}. ✓
(P1a)-via-andR: B_b <: B_a \ S ∪ {X}, the cross-branch case. Need to derive A <: B_a \ S ∪ {X} via a different route.

For the cross-branch case:

We have B_b <: B_a \ S ∪ {X} (cross-branch).
We have (sub-b): A <: B_b \ S (sub-derivation).
Now we need B_b’s self-eq to apply IH (2) on (sub-b). Look at (P1b).
Inversion of (P1b) [B_a & B_b <: B_b \ S ∪ {X}, B_b ordinary]:
  - via-andL: B_a <: B_b \ S ∪ {X}. Combined with B_b <: B_a \ S ∪ {X} from above, transitivity gives B_a <: B_a \ S ∪ {X}, so B_a’s self-eq holds after all. (Both ssub_d_andL and ssub_d_andR can sometimes derive the same goal; we only need one.) Apply IH (2) on (sub-a) with B_a’s self-eq. ✓
  - via-andR: B_b <: B_b \ S ∪ {X}, which is B_b’s self-eq. Apply IH (2) to (sub-b) with B_b’s self-eq: get A <: B_b \ S ∪ {X}. By transitivity (modulo S ∪ {X}) with B_b <: B_a \ S ∪ {X} (cross-branch from (P1a)): A <: B_a \ S ∪ {X}. ✓

So all sub-subcases of subcase “B_a ordinary” close. ✓

Subcase B_a non-ordinary (i.e., B_a is itself an intersection): the rule for (P1a) cannot be ssub_d_andL/R (those require ordinary RHS). It must be ssub_d_and, decomposing the RHS B_a further. Then we recurse on B_a’s structure.

Hmm wait, B_a being an intersection means premise 2’s sub-derivation (sub-a) A <: B_a \ S would also have its last rule as ssub_d_and (since B_a is intersection on the right). That’s a different sub-derivation structure.

Let me just note: if B_a = B_aa & B_ab, then (P1a) decomposes via ssub_d_and to B_a & B_b <: B_aa \ S ∪ {X} and B_a & B_b <: B_ab \ S ∪ {X}. And (sub-a) decomposes to A <: B_aa \ S and A <: B_ab \ S. Now we have new sub-goals: A <: B_aa \ S ∪ {X} and A <: B_ab \ S ∪ {X}.

But these are smaller (the types B_aa, B_ab are smaller than B_a). The original goal A <: B_a & B_b \ S ∪ {X} decomposes by ssub_d_and into A <: B_a \ S ∪ {X} (which decomposes further) and A <: B_b \ S ∪ {X}. Each leaf of the decomposition is a smaller-types subgoal that the IH handles.

Concretely: by induction on type-size, the subgoal A <: B_aa \ S ∪ {X} is |A| + |B_aa| < |A| + |B| (since |B_aa| < |B_a| < |B|). ✓ smaller.

So we can recurse on the non-ordinary case by decomposing the intersection further, eventually reaching ordinary B_a leaves.

Actually wait — I need to be careful. When B_a is non-ordinary, we don’t directly apply Lemma 2’ on (sub-a) A <: B_a \ S. Instead, we further decompose. The decomposition is a meta-level argument, not a direct IH call.

Let me restructure: in the ssub_d_and case for goal (2), for each branch of B, repeatedly decompose intersections until we hit ordinary leaves, then apply the subcase-B_a-ordinary analysis above. The induction terminates because each decomposition reduces type size.

Cleaner statement: in the ssub_d_and case, the IH applies to (sub-a) and (sub-b) directly because they have smaller |A| + |B'| (where B' is B_a or B_b). The IH gives us A <: B_a \ S ∪ {X} and A <: B_b \ S ∪ {X} if we can supply self-eq for B_a and B_b.

Hmm, but we might not have B_a ‘s self-eq directly. So the IH for (2) doesn’t fire cleanly.

Let me back up. The direct IH (2) call needs B_a ~ B_a \ S ∪ {X} as hypothesis. The cross-branch analysis is what gets us this (or, alternatively, gets us the goal via a different route).

So the case analysis I did above (subcase “B_a ordinary”) is the key, and the non-ordinary case reduces to the ordinary case via further decomposition.

Case ssub_d_andL (intersection elimination on left, branch a)

A = A_a & A_b, B ordinary. Sub-derivation:

(sub) A_a <: B \ S

Want A_a & A_b <: B \ S ∪ {X} via ssub_d_andL (with A_a <: B \ S ∪ {X} and lc_typ A_b, ord B) or via ssub_d_andR (with A_b <: B \ S ∪ {X}).

Goal (1): Premise A_a & A_b ~ A_a & A_b \ S ∪ {X}, which decomposes (as before) into:

(P1a) A_a & A_b <: A_a \ S ∪ {X}
(P1b) A_a & A_b <: A_b \ S ∪ {X}

(And also the reverse direction A_a & A_b <: A_a & A_b \ S ∪ {X} derived by ssub_d_and from these — but that’s the same as having (P1a) and (P1b).)

Subcase A_a ordinary (the typical case for ssub_d_andL to apply on premise 2):

Inversion of (P1a) [A_a & A_b <: A_a \ S ∪ {X}, A_a ordinary]:

via-andL: A_a <: A_a \ S ∪ {X}, which is A_a’s self-eq.
via-andR: A_b <: A_a \ S ∪ {X}, the cross-branch case.

Sub-subcase via-andL: A_a’s self-eq directly. Apply IH (1) to (sub) with A_a’s self-eq: get A_a <: B \ S ∪ {X}. Apply ssub_d_andL (need lc_typ A_b ✓ from premise 2’s ssub_d_andL, ord B ✓). ✓

Sub-subcase via-andR: A_b <: A_a \ S ∪ {X} cross-branch. Now look at (P1b).

Inversion of (P1b) [A_a & A_b <: A_b \ S ∪ {X}, with A_b’s ordinary status TBD]:

Sub-subcase A_b ordinary:

via-andL: A_a <: A_b \ S ∪ {X}. Combined with A_b <: A_a: by transitivity, A_a <: A_a \ S ∪ {X}, which is A_a’s self-eq. Reduce to via-andL above. ✓
via-andR: A_b <: A_b \ S ∪ {X}, which is A_b’s self-eq.

For the (via-andR, A_b ordinary, A_b’s self-eq) sub-subcase:

We have A_b <: A_a \ S ∪ {X} (cross-branch, from (P1a) via-andR).
Avoidance-set monotonicity gives A_b <: A_a \ S.
Combined with (sub) A_a <: B \ S by transitivity: A_b <: B \ S.
Sizes: |A_b| + |B| < |A_a & A_b| + |B| ✓ smaller.
Apply IH (1) to A_b <: B \ S with A_b’s self-eq: get A_b <: B \ S ∪ {X}.
Apply ssub_d_andR (need lc_typ A_a and ord B) to get A_a & A_b <: B \ S ∪ {X}. ✓

Sub-subcase A_b non-ordinary: A_b = A_ba & A_bb (intersection). Then (P1b) is A_a & A_b <: A_ba & A_bb \ S ∪ {X}, derived by ssub_d_and (the only rule for intersection on the right) from:

A_a & A_b <: A_ba \ S ∪ {X}… (P1b-1)
A_a & A_b <: A_bb \ S ∪ {X}… (P1b-2)

This is a structural decomposition. We don’t directly get A_b’s self-eq from this, but we can reason as follows.

For each of (P1b-1) and (P1b-2), we recurse on the inversion (since A_ba, A_bb are smaller than A_b). By recursion, we’ll eventually hit ordinary types and apply the analysis above.

Hmm, but this recursion is on the structure of A_b, not a direct IH call. It’s a meta-level inductive argument inside the case analysis. Let me think about whether this is well-founded.

Actually, the issue is that we’re trying to derive A_b’s self-eq (or sufficient information) from premise 1’s decomposition when A_b is non-ordinary. The decomposition gives us facts like A_a & A_b <: A_ba \ S ∪ {X}, not directly A_b <: A_b.

Let me approach this differently. Instead of trying to derive A_b’s self-eq, let me look at what (P1b-1) and (P1b-2) tell us about A_b.

(P1b-1): A_a & A_b <: A_ba \ S ∪ {X}. By inversion (assuming A_ba ordinary; if not, recurse):

via-andL: A_a <: A_ba \ S ∪ {X}.
via-andR: A_b <: A_ba \ S ∪ {X}.

(P1b-2): similarly.

If both (P1b-1) and (P1b-2) go via-andR, we have A_b <: A_ba and A_b <: A_bb, hence A_b <: A_ba & A_bb = A_b \ S ∪ {X} by ssub_d_and. So A_b’s self-eq is derivable in this sub-subsubcase. ✓

If one or both go via-andL, we get A_a <: A_ba or A_a <: A_bb. Combined with cross-branch A_b <: A_a (from (P1a) via-andR), by transitivity we get A_b <: A_ba or A_b <: A_bb from below. Hmm wait, no: A_b <: A_a <: A_ba gives A_b <: A_ba. ✓. So if (P1b-1) goes via-andL giving A_a <: A_ba, transitivity with A_b <: A_a gives A_b <: A_ba. Similarly for (P1b-2).

So in all sub-subsubcases, we can derive A_b <: A_ba \ S ∪ {X} and A_b <: A_bb \ S ∪ {X} (using transitivity through A_a if needed). Hence A_b <: A_b \ S ∪ {X} by ssub_d_and. A_b’s self-eq is derivable. ✓

Then proceed as in the ordinary case: weaken the cross-branch fact to A_b <: A_a \ S by monotonicity, chain it with A_a <: B \ S by transitivity to get A_b <: B \ S, apply IH (1) with A_b’s self-eq, conclude via ssub_d_andR. ✓

Subcase A_a non-ordinary: A_a = A_aa & A_ab (intersection). But premise 2’s ssub_d_andL requires B ordinary, which is fine, but the sub-derivation is A_a <: B \ S with A_a an intersection.

Hmm wait — in ssub_d_andL, the LHS of the conclusion is A_a & A_b, and the LHS of the sub-derivation is just A_a. There’s no constraint that A_a is ordinary. So A_a could itself be an intersection.

If A_a = A_aa & A_ab, then (sub) A_a <: B \ S is A_aa & A_ab <: B \ S. By inversion of premise 2’s last rule (which is ssub_d_andL on the outermost intersection), the sub-derivation has its own structure: it could itself end in ssub_d_andL/R, etc. But that’s an inversion of the sub-derivation, not premise 2.

The IH applies to the sub-derivation, which has type sizes |A_a| + |B| < |A_a & A_b| + |B|. ✓ smaller.

So even if A_a is non-ordinary, the IH applies to (sub) directly with smaller types. We just need self-eq of A_a (for goal 1).

Inversion of (P1a) when A_a is non-ordinary: (P1a) is A_a & A_b <: A_a \ S ∪ {X} = A_a & A_b <: A_aa & A_ab \ S ∪ {X}. Derived by ssub_d_and (the only rule for intersection on the right):

A_a & A_b <: A_aa \ S ∪ {X}
A_a & A_b <: A_ab \ S ∪ {X}

By inversion (assuming A_aa, A_ab ordinary; recurse otherwise), each gives via-andL or via-andR, and by similar analysis to the ordinary case, we can derive A_a <: A_aa \ S ∪ {X} and A_a <: A_ab \ S ∪ {X} (possibly via transitivity through A_b cross-branch). Hence A_a <: A_a = A_aa & A_ab \ S ∪ {X} by ssub_d_and. A_a’s self-eq is derivable. ✓

Wait, but the cross-branch case here gives A_b <: A_aa or A_b <: A_ab, and we want A_a <: A_aa. The chain is A_a <: ? — we need A_a <: A_aa.

From (P1a-1) [A_a & A_b <: A_aa \ S ∪ {X}, A_aa ordinary]:

via-andL: A_a <: A_aa \ S ∪ {X}. ✓ direct.
via-andR: A_b <: A_aa \ S ∪ {X}. Need to upgrade to A_a <: A_aa. Hmm, is there a chain?

If we know A_a <: A_b \ S ∪ {X} (the other cross-branch direction), we’d chain: A_a <: A_b <: A_aa. Do we have A_a <: A_b?

This comes from (P1b)’s decomposition. Looking at (P1b): A_a & A_b <: A_b \ S ∪ {X}. Decomposing (depending on whether A_b ordinary or not).

OK this is getting into deeper case nesting. Let me try to abstract: the case analysis is structurally finite — each level of case-splitting reduces some type’s size — so it terminates. At each leaf, either we have direct self-eq or we derive it via a transitivity chain through cross-branch facts.

Let me see if there’s a counterexample where this analysis genuinely fails. Suppose A_a = X -> Top (active X, no self-eq directly), A_b = Top -> Top (no X, has self-eq). We’re in subcase via-andR for (P1a) (cross-branch A_b <: A_a).

To derive A_a’s self-eq, we’d need… hmm wait, do we need it? Let me re-examine the goal in this configuration.

Goal: A_a & A_b <: B \ S ∪ {X}. Sub-derivation (sub): A_a <: B \ S.

Concrete instance: A_a = X -> Top, B = X -> Top. Then (sub) X -> Top <: X -> Top \ S ✓ (since X ∉ S).

Goal: (X -> Top) & (Top -> Top) <: X -> Top \ S ∪ {X}. Try ssub_d_andL: X -> Top <: X -> Top \ S ∪ {X} requires X <: X \ S ∪ {X} — blocked. Try ssub_d_andR: Top -> Top <: X -> Top \ S ∪ {X} requires X <: Top \ S ∪ {X} ✓ (ssub_d_top!), Top <: Top ✓. ✓

So the goal is derivable via ssub_d_andR! Match this to the case analysis: subcase via-andR for (P1a), sub-subcase A_b ordinary, via-andR for (P1b) (since A_b = Top -> Top self-eq holds). We have A_b <: A_a \ S ∪ {X} (from (P1a)) and A_b <: A_b \ S ∪ {X} (self-eq).

Construct A_b <: B \ S: monotonicity on A_b <: A_a \ S ∪ {X} gives A_b <: A_a \ S; transitivity with A_a <: B \ S gives A_b <: B \ S. ✓

Apply IH (1) to A_b <: B \ S with A_b’s self-eq: get A_b <: B \ S ∪ {X}. ✓

This matches the syntactic derivation ssub_d_andR with A_b <: B \ S ∪ {X}. ✓

Great, the case analysis matches the actual derivation.

Goal (2): Premise B ~ B \ S ∪ {X}. We have B’s self-eq, unchanged.

Apply IH (2) to (sub) A_a <: B \ S with B’s self-eq: get A_a <: B \ S ∪ {X}. Apply ssub_d_andL. ✓ direct.

This case is easier because we just inherit B’s self-eq directly.

Case ssub_d_andR (intersection elimination on left, branch b)

Symmetric to ssub_d_andL, swapping A_a and A_b everywhere. ✓

Summary

All cases close, with the key technical structure:

Arrow case: dual goals (1) and (2) feed each other across contravariance.
Intersection-on-right (ssub_d_and): for goal (2), case-analyze premise 1’s decomposition; in cross-branch subcases, use transitivity through cross-branch fact to construct a smaller sub-derivation, apply IH on that.
Intersection-elim-on-left (ssub_d_andL/R): for goal (1), case-analyze premise 1’s decomposition; mirror the strategy.

The induction is on |A| + |B|. Auxiliary tools used: transitivity modulo S, avoidance-set monotonicity.

Where I want to flag uncertainty

Non-ordinary case nesting. When A_a, A_b, B_a, B_b are themselves intersections, the case analysis nests deeper. I’ve sketched that it terminates because each decomposition reduces type size, but a fully rigorous Coq proof would need to carefully structure the recursion (perhaps via a separate auxiliary lemma about deriving self-equivalence from intersection decomposition).
The transitivity chain in cross-branch cases. I use transitivity in \ S to construct sub-derivations, then apply IH to those. The IH’s hypothesis (self-eq) might not transport perfectly through this construction. I’ve checked the concrete cases that arose, but a Coq formalization would need to verify each step carefully.
Mutual recursion between goals (1) and (2). The proof has IH (1) and IH (2) calling each other in different cases. The induction is well-founded on type-size, but the simultaneous recursion needs to be set up correctly. This is standard but tedious.
What if premise 2’s ssub_d_andL has A_a non-ordinary? I handled this by noting the IH applies to the sub-derivation regardless. But the case analysis on premise 1 needs A_a ordinary in some places (to invert via ssub_d_andL/R). If A_a is non-ordinary, premise 1 inversion goes through ssub_d_and, and we recurse on A_a’s structure to get its self-eq. I sketched this but didn’t verify all sub-subcases.

Overall, I believe the proof works, but the non-ordinary nesting is the part most likely to have subtle issues in a full formalization. The ordinary case is solid.

Want me to focus on the non-ordinary nested intersection case more carefully, or try a different angle?

You

Great, I will send lemma 2’ to my agent to start the mechanization. In the meanwhile, can you summarize the rest of the framework up to the original theorem (the inversion theorem, inverting substitutions), so that it can form a complete prompt for the mechanization?

Claude

Complete Framework: Inversion Theorem for Substitutions with Nominal Labels

Background and System

We work in a subtyping system with intersection types, arrow types, records (labels), type variables, Top, and nat as primitive. The system uses avoidance-aware subtyping A <: B \ S, where S is a set of variables that are explicitly forbidden from participating reflexively.

The Avoidance Subtyping Rules (ssub_disjoint)

Inductive ssub_disjoint : typ -> typ -> atoms -> Prop :=
  | ssub_d_nat   : forall D,
      ssub_disjoint typ_nat typ_nat D
  | ssub_d_top   : forall A D,
      lc_typ A ->
      ssub_disjoint A typ_top D
  | ssub_d_var   : forall X D,
      X `notin` D ->
      ssub_disjoint (typ_fvar X) (typ_fvar X) D
  | ssub_d_arrow : forall A1 A2 B1 B2 D,
      ssub_disjoint B1 A1 D ->
      ssub_disjoint A2 B2 D ->
      ssub_disjoint (typ_arrow A1 A2) (typ_arrow B1 B2) D
  | ssub_d_and   : forall A B1 B2 D,
      ssub_disjoint A B1 D ->
      ssub_disjoint A B2 D ->
      ssub_disjoint A (typ_and B1 B2) D
  | ssub_d_andL  : forall A1 A2 B D,
      ssub_disjoint A1 B D ->
      lc_typ A2 ->
      ord B ->
      ssub_disjoint (typ_and A1 A2) B D
  | ssub_d_andR  : forall A1 A2 B D,
      ssub_disjoint A2 B D ->
      lc_typ A1 ->
      ord B ->
      ssub_disjoint (typ_and A1 A2) B D
  | ssub_d_rcd   : forall A B l D,
      ssub_disjoint A B D ->
      ssub_disjoint (typ_rcd l A) (typ_rcd l B) D.
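
As a sanity check on these rules, here is a minimal self-contained sketch. It is not the formalization's own code. It assumes atoms and labels are plain nat, avoidance sets are lists, and lc_typ can be dropped because this toy syntax has no bound variables. The function check and its fuel argument are hypothetical helpers that mirror the rules clause by clause. The examples replay the counterexample T₁ = X -> Top, T₂ = Top -> Top discussed earlier.

```coq
Require Import List Arith.

(* Assumption: atoms and labels are nat; no bound variables, so no lc_typ. *)
Inductive typ : Type :=
  | typ_nat : typ
  | typ_top : typ
  | typ_fvar : nat -> typ
  | typ_arrow : typ -> typ -> typ
  | typ_and : typ -> typ -> typ
  | typ_rcd : nat -> typ -> typ.

(* Hypothetical fuel-bounded checker; D is the avoidance set as a list. *)
Fixpoint check (fuel : nat) (A B : typ) (D : list nat) {struct fuel} : bool :=
  match fuel with
  | O => false
  | S n =>
      match B with
      (* ssub_d_and: an intersection on the right is always split *)
      | typ_and B1 B2 => andb (check n A B1 D) (check n A B2 D)
      (* ssub_d_top *)
      | typ_top => true
      | _ =>
          match A, B with
          (* ssub_d_andL / ssub_d_andR: B is ordinary at this point *)
          | typ_and A1 A2, _ => orb (check n A1 B D) (check n A2 B D)
          (* ssub_d_nat *)
          | typ_nat, typ_nat => true
          (* ssub_d_var: reflexivity is blocked for avoided variables *)
          | typ_fvar Y, typ_fvar Z =>
              andb (Nat.eqb Y Z) (negb (existsb (Nat.eqb Y) D))
          (* ssub_d_arrow: contravariant domain, covariant codomain *)
          | typ_arrow A1 A2, typ_arrow B1 B2 =>
              andb (check n B1 A1 D) (check n A2 B2 D)
          (* ssub_d_rcd *)
          | typ_rcd l A', typ_rcd m B' =>
              andb (Nat.eqb l m) (check n A' B' D)
          | _, _ => false
          end
      end
  end.

Definition X : nat := 0.
Definition T1 : typ := typ_arrow (typ_fvar X) typ_top.  (* X -> Top *)
Definition T2 : typ := typ_arrow typ_top typ_top.       (* Top -> Top *)

(* T1 & T2 is self-equivalent while avoiding X (via cross-branch T2 <: T1) *)
Example inter_self_eq :
  check 10 (typ_and T1 T2) (typ_and T1 T2) (cons X nil) = true.
Proof. reflexivity. Qed.

(* ... but T1 alone is not, so the subclaim fails. *)
Example t1_self_eq_fails : check 10 T1 T1 (cons X nil) = false.
Proof. reflexivity. Qed.

(* The lifted goal (T1 & T2) <: T1 avoiding X still holds, through T2. *)
Example lifted_goal :
  check 10 (typ_and T1 T2) T1 (cons X nil) = true.
Proof. reflexivity. Qed.
```

The fuel value 10 is arbitrary; any value above the combined depth of the two types is enough for these examples.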

Notation Conventions

A <: B \ S means ssub_disjoint A B S.
A ~ B \ S means A <: B \ S and B <: A \ S (mutual subtyping in the avoidance system).
A[X → {X:C}] means substituting the type variable X (i.e., typ_fvar X) with the record type typ_rcd X C.
“Ordinary type” (ord T) means T is not a top-level intersection.
A nominal label X in typ_rcd X _ is a syntactic identifier; we assume that in the original (un-substituted) types A, B, the identifier X does not appear free as a label name — only as a type variable.
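
To make the substitution notation concrete, here is a small sketch of A[X → {X:C}] as a function. It assumes, for illustration only, that atoms and labels are both modelled as nat and that there are no bound variables; the name subst_rcd is hypothetical and not taken from the formalization.

```coq
Require Import Arith.

Inductive typ : Type :=
  | typ_nat : typ
  | typ_top : typ
  | typ_fvar : nat -> typ
  | typ_arrow : typ -> typ -> typ
  | typ_and : typ -> typ -> typ
  | typ_rcd : nat -> typ -> typ.

(* Replace every free occurrence of the variable X by the record {X : C}. *)
Fixpoint subst_rcd (X : nat) (C : typ) (T : typ) : typ :=
  match T with
  | typ_nat => typ_nat
  | typ_top => typ_top
  | typ_fvar Y => if Nat.eqb X Y then typ_rcd X C else typ_fvar Y
  | typ_arrow A B => typ_arrow (subst_rcd X C A) (subst_rcd X C B)
  | typ_and A B => typ_and (subst_rcd X C A) (subst_rcd X C B)
  | typ_rcd l A => typ_rcd l (subst_rcd X C A)
  end.

(* ((Top -> Top) & (X -> Top))[X -> {X : nat}], with X modelled as atom 0 *)
Example subst_example :
  subst_rcd 0 typ_nat
    (typ_and (typ_arrow typ_top typ_top) (typ_arrow (typ_fvar 0) typ_top))
  = typ_and (typ_arrow typ_top typ_top)
            (typ_arrow (typ_rcd 0 typ_nat) typ_top).
Proof. reflexivity. Qed.
```

Because X never occurs as a label in the original types, every typ_rcd X _ in the result is known to come from this substitution.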

Standing Assumptions / Auxiliary Lemmas (assumed admissible)

Transitivity modulo S: If A <: B \ S and B <: C \ S, then A <: C \ S.
Avoidance-set monotonicity: If A <: B \ S ∪ {X} and X ∉ S, then A <: B \ S.
Standard structural size: |T| is defined inductively, with each constructor adding 1 to the sum of component sizes.

These are standard meta-theoretic properties of the system.
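
For reference, the size measure and two of the decrease facts the induction relies on can be sketched as follows. This assumes the same toy syntax with nat atoms; size_typ matches the name in the proof state, while the lemma names are made up for illustration.

```coq
Require Import Lia.

Inductive typ : Type :=
  | typ_nat : typ
  | typ_top : typ
  | typ_fvar : nat -> typ
  | typ_arrow : typ -> typ -> typ
  | typ_and : typ -> typ -> typ
  | typ_rcd : nat -> typ -> typ.

(* Each constructor adds 1 to the sum of its component sizes. *)
Fixpoint size_typ (T : typ) : nat :=
  match T with
  | typ_nat => 1
  | typ_top => 1
  | typ_fvar _ => 1
  | typ_arrow A B => S (size_typ A + size_typ B)
  | typ_and A B => S (size_typ A + size_typ B)
  | typ_rcd _ A => S (size_typ A)
  end.

(* Switching to the other branch of an intersection still shrinks the measure. *)
Lemma size_other_branch : forall A1 A2 B,
  size_typ A2 + size_typ B < size_typ (typ_and A1 A2) + size_typ B.
Proof. intros A1 A2 B. simpl. lia. Qed.

(* The contravariant premise of the arrow rule is also smaller. *)
Lemma size_arrow_dom : forall A1 A2 B1 B2,
  size_typ B1 + size_typ A1
  < size_typ (typ_arrow A1 A2) + size_typ (typ_arrow B1 B2).
Proof. intros A1 A2 B1 B2. simpl. lia. Qed.
```

The first lemma is the fact used when a cross-branch step replaces A_a & A_b <: B by A_b <: B.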

The Goal: Inversion Theorem

Theorem (Substitution Inversion). Suppose X ∉ S, and X does not appear free as a label in A or B. If A[X → {X:C}] ~ B[X → {X:D}] \ S syntactic, then either:

C ~ D \ S syntactic, or

A ~ B \ S ∪ {X} syntactic.

The theorem expresses that when two substituted types are equivalent in the avoidance system, either the substituted bodies are themselves equivalent, or the variable X was “inactive” in the original types — formalized by the equivalence holding under the extended avoidance set.

The proof proceeds by induction on |A| + |B|. The hard case is when A = A₁ & A₂ and B is ordinary (and the symmetric case where B is an intersection); within this case, the proof routes through the auxiliary Lemma 2’.

Lemma 1: Substitution-Invariance via Transitivity

Lemma 1. If A ~ B \ S ∪ {X} syntactic, then by transitivity:

B ~ B \ S ∪ {X} syntactic.

A ~ A \ S ∪ {X} syntactic.

Proof: Direct from A <: B, B <: A, and transitivity.

This is used to extract self-equivalence facts about the “good half” of the intersection in the inductive step.
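
Lemma 1 depends only on transitivity, so it can be sketched abstractly. The section below assumes an arbitrary relation sub with a transitivity hypothesis sub_trans standing in for ssub_disjoint and its admissible transitivity lemma; both names are placeholders.

```coq
Section Lemma1.
  Variable typ : Type.
  Variable atoms : Type.
  (* Placeholder for ssub_disjoint. *)
  Variable sub : typ -> typ -> atoms -> Prop.
  (* Assumed admissible: transitivity at a fixed avoidance set. *)
  Hypothesis sub_trans : forall A B C D,
    sub A B D -> sub B C D -> sub A C D.

  (* From A ~ B at avoidance set D, both self-equivalences follow. *)
  Lemma self_eq_from_equiv : forall A B D,
    sub A B D -> sub B A D -> sub A A D /\ sub B B D.
  Proof.
    intros A B D HAB HBA. split.
    - exact (sub_trans A B A D HAB HBA).
    - exact (sub_trans B A B D HBA HAB).
  Qed.
End Lemma1.
```

Instantiating D with S ∪ {X} gives the two conclusions of Lemma 1.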

Lemma 2’: The Self-Equivalence Lifting Lemma

Lemma 2’. Suppose A <: B \ S syntactic, and X ∉ S. Then:

If A ~ A \ S ∪ {X} syntactic, then A <: B \ S ∪ {X} syntactic.

If B ~ B \ S ∪ {X} syntactic, then A <: B \ S ∪ {X} syntactic.

Proof: By strong induction on |A| + |B|, proving (1) and (2) simultaneously, with case analysis on the last rule of A <: B \ S.

(Complete case analysis already worked out in the previous discussion; this lemma is the one being mechanized first.)

Auxiliary Lemma: Substituent Erasure

Lemma (Substituent Erasure). Suppose X ∉ S, and X does not appear free as a label in A or B. If A[X → {X:C}] <: B[X → {X:D}] \ S syntactic, then A <: B \ S syntactic.

Proof sketch: By induction on the derivation. The key observation is that every typ_rcd X _ in the substituted types arose from substitution (since X does not appear as a label originally). When the derivation hits a ssub_d_rcd rule comparing {X:C} (from LHS) and {X:D} (from RHS), this corresponds to the variable rule ssub_d_var X in the unsubstituted form (since X ∉ S). All other rules transport directly (variables not equal to X, Top-absorption, intersection rules, structural arrow rule).

This lemma converts the original substituted comparison into a comparison on un-substituted types, which can then be fed to Lemma 2’.

Proof of the Inversion Theorem

Setup: Induct on |A| + |B|.

We case-analyze on the structure of A and B.

Trivial cases

When A or B is nat, Top, a non-X variable, or in shapes where X cannot appear meaningfully: direct.

Case: A and B both ordinary, with X participating

The interesting sub-case is when both A and B are arrows or labels with X appearing in matching positions, leading to a forced label comparison {X:C} ⋈ {X:D} in the derivation. By the structure of the comparison and ssub_d_rcd, this gives C ~ D \ S directly. Branch 1.

If X does not actively appear (always in absorbed position), branch 2 follows by structural lifting.

Case: A = A₁ & A₂, B ordinary (the hard case)

By inversion of A[X→{X:C}] ~ B[X→{X:D}] \ S:

(P_left) from A[X→{X:C}] <: B[X→{X:D}] \ S: by ssub_d_andL or ssub_d_andR (since B is ordinary), one of:
  - (P_left-a) A₁[X→{X:C}] <: B[X→{X:D}] \ S, OR
  - (P_left-b) A₂[X→{X:C}] <: B[X→{X:D}] \ S.
(P_right) from B[X→{X:D}] <: A[X→{X:C}] \ S: by ssub_d_and (since A[X→{X:C}] is intersection), gives both:
  - B[X→{X:D}] <: A₁[X→{X:C}] \ S
  - B[X→{X:D}] <: A₂[X→{X:C}] \ S.

WLOG suppose (P_left-a) [the case for (P_left-b) is symmetric]. Combined with the first component of (P_right), B[X→{X:D}] <: A₁[X→{X:C}] \ S, this gives A₁[X→{X:C}] ~ B[X→{X:D}] \ S (mutual subtyping established).

Apply IH on (A₁, B): size |A₁| + |B| < |A₁ & A₂| + |B|. ✓

The IH gives: either C ~ D \ S (return branch 1), or A₁ ~ B \ S ∪ {X} syntactic.

Take the second case: A₁ ~ B \ S ∪ {X}.

By Lemma 1 applied to this:

B ~ B \ S ∪ {X} syntactic.
A₁ ~ A₁ \ S ∪ {X} syntactic.

Goal: Derive A₁ & A₂ ~ B \ S ∪ {X} syntactic.

<: direction (A₁ & A₂ <: B \ S ∪ {X}):

Apply ssub_d_andL: need A₁ <: B \ S ∪ {X} (and lc_typ A₂, ord B).

A₁ <: B \ S ∪ {X} is half of A₁ ~ B \ S ∪ {X} from the IH. ✓
Side conditions hold by well-formedness assumptions.

:> direction (B <: A₁ & A₂ \ S ∪ {X}):

Apply ssub_d_and: need B <: A₁ \ S ∪ {X} AND B <: A₂ \ S ∪ {X}.

B <: A₁ \ S ∪ {X} is the other half of A₁ ~ B \ S ∪ {X} from the IH. ✓
B <: A₂ \ S ∪ {X} is the non-trivial subgoal. We derive it via Lemma 2’:
  - Apply Substituent Erasure to B[X→{X:D}] <: A₂[X→{X:C}] \ S (which we have from (P_right)), obtaining B <: A₂ \ S syntactic.
  - Apply Lemma 2’ (1) to B <: A₂ \ S with hypothesis B ~ B \ S ∪ {X} (from Lemma 1 above), obtaining B <: A₂ \ S ∪ {X} syntactic. Clause (1) is the one that applies because the self-equivalent type B is on the left-hand side of this judgment. ✓

Combine A₁ & A₂ <: B \ S ∪ {X} and B <: A₁ & A₂ \ S ∪ {X} to conclude A₁ & A₂ ~ B \ S ∪ {X} syntactic. ✓ Branch 2.

Symmetric case: (P_left-b) instead of (P_left-a)

When A₂[X→{X:C}] <: B[X→{X:D}] \ S is the winning branch, apply IH on (A₂, B) instead. The reasoning is symmetric: get A₂ ~ B \ S ∪ {X}, lift B <: A₁ \ S to B <: A₁ \ S ∪ {X} via Lemma 2’ (1) (B is again the self-equivalent left-hand side), use ssub_d_andR for the <: direction.

Case: B = B₁ & B₂ with A ordinary (symmetric to the above)

The roles of A and B swap. The proof structure mirrors the above:

One branch of B participates fully (A ~ B_i \ S ∪ {X} from IH).
Lemma 1 gives A’s self-equivalence.
Lemma 2’ (1) lifts the asymmetric direction A <: B_j \ S to A <: B_j \ S ∪ {X}.

Case: Both A and B are intersections

Decompose both. The case analysis branches more, but each combination reduces to one of the above.

Order of Mechanization (Suggested)

Foundational properties (likely already in the formalization):
  - Local closure (lc_typ).
  - Ordinariness (ord).
  - Transitivity of ssub_disjoint.
  - Avoidance-set monotonicity.
  - Type size measure.
Lemma 2’ (the current target):
  - State as simultaneous (1) and (2).
  - Strong induction on |A| + |B|.
  - Case analysis on last rule of A <: B \ S.
  - Auxiliary structural reasoning for non-ordinary intersection branches.
Substituent Erasure Lemma:
  - Induction on the derivation of A[X→{X:C}] <: B[X→{X:D}] \ S.
  - Case analysis on rules; the ssub_d_rcd case (with l = X) corresponds to ssub_d_var X in the un-substituted form.
Lemma 1 (trivial via transitivity).
Inversion Theorem:
  - Strong induction on |A| + |B|.
  - Case analysis on the structures of A and B.
  - Hard case: one of them an intersection. Use IH on a strictly smaller pair, then use Lemma 1 + Lemma 2’ + Substituent Erasure to lift the asymmetric direction.

Caveats and Subtleties to Watch For

Non-ordinary nested intersections: In the intersection cases of Lemma 2’ (ssub_d_and, ssub_d_andL, ssub_d_andR), if any of A_a, A_b, B_a, B_b are themselves intersections, the case analysis on premise 1’s decomposition recurses on type structure. The termination is via type-size, but the bookkeeping is delicate.
Substituent Erasure assumes X doesn’t appear as a label originally. This is a hypothesis of the inversion theorem and must be threaded through.
Transitivity is used in cross-branch cases of Lemma 2’. If your transitivity lemma has special conditions or restrictions, those may need to be checked at each use.
The simultaneous induction on (1) and (2) of Lemma 2’ is essential — the contravariant arrow case requires the dual hypothesis. Single-goal proofs of either (1) or (2) alone will get stuck.
Branch 1 of the Inversion Theorem (C ~ D \ S) comes naturally from cases where the derivation forces a label comparison {X:C} vs {X:D}. Carefully identify in the proof where this case arises (typically in the structural-matching base cases).
The “X does not appear as a label” condition is critical — without it, typ_rcd X _ could appear intrinsically in A or B, and the Substituent Erasure transformation would not be well-defined.

This should give a complete picture for the mechanization. The Lemma 2’ proof is the technical heart; once it’s in place, the rest of the framework assembles around it relatively cleanly. Let me know if you want me to elaborate any specific part further or revise the framing of any step.

You

Issue noted!

k: nat
IHk: forall (A B: typ) (S: atoms), size_typ A + size_typ B < k → forall X: atom, X notin S → ssub_disjoint A B S → (ssub_disjoint A A (add X S) → ssub_disjoint A B (add X S)) /\ (ssub_disjoint B B (add X S) → ssub_disjoint A B (add X S))
B: typ
S: atoms
A1, A2: typ
Hsiz: Datatypes.S (size_typ A1 + size_typ A2 + size_typ B) < Datatypes.S k
X: atom
Hnin: X notin S
Heq: ssub_disjoint (typ_and A1 A2) (typ_and A1 A2) (add X S)
Hsub1: ssub_disjoint A1 B S
Hsub2: lc_typ A2
Hsub3: ord B
Heq1: ssub_disjoint (typ_and A1 A2) A1 (add X S)
Heq2: ssub_disjoint (typ_and A1 A2) A2 (add X S)
Heq1b: ssub_disjoint A2 A1 (add X S)
Heq2b: ssub_disjoint A2 A2 (add X S)
(1 / 1)
ssub_disjoint (typ_and A1 A2) B (add X S)

Using the proof technique we described, we can analyze the reflexive subtyping relation when the type is an intersection and extract reflexive results for its subcomponents from the equality condition. However, we cannot ensure that the reflexive part we obtain (the one that avoids X) is consistent with the subtyping between A and B. You can see this from the proof script; I got stuck at the proof state above.

Am I understanding the proof technique you described correctly? Or is this a real problem, and if so, how should we fix it?

Claude

Analyzing the Stuck State

Let me read the proof state carefully to make sure I understand where you’re stuck.

What you have at the stuck state

You’re in the ssub_d_andL case (intersection elimination on the left of premise 2):

Hsub1 : A1 <: B \ S (the surviving sub-derivation)
Hsub2 : lc_typ A2, Hsub3 : ord B
Heq : (A1 & A2) <: (A1 & A2) \ S ∪ {X} (premise 1, self-eq of the intersection)

You inverted Heq to get:

Heq1 : (A1 & A2) <: A1 \ S ∪ {X}
Heq2 : (A1 & A2) <: A2 \ S ∪ {X}

And further inverted Heq1 (assuming A1 ordinary) to get the cases:

Heq1a (subcase via-andL): A1 <: A1 \ S ∪ {X} — A1’s self-eq directly. Handled, easy.
Heq1b (subcase via-andR): A2 <: A1 \ S ∪ {X} — cross-branch.

In the cross-branch case Heq1b, you proceeded to invert Heq2, getting:

Heq2a (via-andL): A1 <: A2 \ S ∪ {X}, which combined with Heq1b : A2 <: A1 gives A1’s self-eq via transitivity.
Heq2b (via-andR): A2 <: A2 \ S ∪ {X} — A2’s self-eq.

You handled Heq2a via the transitivity chain. The stuck state is in the Heq2b subcase:

Heq1b : A2 <: A1 \ S ∪ {X}
Heq2b : A2 <: A2 \ S ∪ {X}
Hsub1 : A1 <: B \ S
Goal: A1 & A2 <: B \ S ∪ {X}

What I had claimed would work

In my earlier sketch, I said:

Construct A_b <: B \ S: monotonicity on A_b <: A_a \ S ∪ {X} gives A_b <: A_a \ S; transitivity with A_a <: B \ S gives A_b <: B \ S. ✓ smaller.

Apply IH (1) to A_b <: B \ S with A_b’s self-eq: get A_b <: B \ S ∪ {X}. Apply ssub_d_andR. ✓

In your notation (with A1 = A_a, A2 = A_b):

From Heq1b : A2 <: A1 \ S ∪ {X}, apply monotonicity to get A2 <: A1 \ S.
Transitivity with Hsub1 : A1 <: B \ S to get A2 <: B \ S.
Apply IH (1) on A2 <: B \ S with Heq2b : A2 <: A2 \ S ∪ {X} to get A2 <: B \ S ∪ {X}.
Apply ssub_d_andR to get A1 & A2 <: B \ S ∪ {X}. ✓

In your script, you did the monotonicity-and-transitivity step in the Heq2a case via ssub_disjoint_trans Heq2a Heq1b (chaining A1 <: A2 and A2 <: A1 to get A1 <: A1). For the Heq2b case, the analogous move would be:

ssub_disjoint_trans Heq1b Hsub1, after applying monotonicity to Heq1b: get A2 <: B \ S.
Apply IH (1) with Heq2b to get A2 <: B \ S ∪ {X}.
Apply ssub_d_andR.

So is this a real problem, or just an unfinished case?

Looking at your proof script, I see you’ve handled Heq2a via the transitivity chain (using ssub_disjoint_trans Heq2a Heq1b). The stuck state above is in the Heq2b subcase, which I think is the case I described as closable via transitivity through A2.

If you have:

Heq1b : A2 <: A1 \ S ∪ {X}
Heq2b : A2 <: A2 \ S ∪ {X}
Hsub1 : A1 <: B \ S

Then the chain is:

A2 <: A1 \ S (monotonicity from Heq1b).
A2 <: B \ S (transitivity of the previous step with Hsub1).
A2 <: B \ S ∪ {X} (IH (1) with Heq2b).
A1 & A2 <: B \ S ∪ {X} (apply ssub_d_andR. Its side condition is lc_typ A1, which Hsub2 : lc_typ A2 does not provide directly; it should follow from the well-formedness of Hsub1 : A1 <: B \ S.)

So the case should close with this chain.

What might be confusing

You wrote:

I should say that we cannot ensure that the reflexive part we get that avoids X is consistent with the subtyping between A and B.

I think you might be worried about whether the “self-equivalence of A2 ” (from Heq2b) is the right self-eq for A2 to combine with Hsub1: A1 <: B \ S. But the proof doesn’t need consistency of that kind — it constructs a new sub-derivation A2 <: B \ S via transitivity (using the cross-branch A2 <: A1), and then applies the IH on that constructed sub-derivation, which is independent of the original A1 <: B \ S route in premise 2.

The IH gives A2 <: B \ S ∪ {X} regardless of how A1 <: B \ S was derived. The ssub_d_andR rule then closes the goal.

Concrete suggestion

Try in Coq:

* (* Heq2b case *)
  forwards Hmon: ssub_disjoint_monotonic Heq1b. (* A2 <: A1 \ S *)
  forwards Htrans: ssub_disjoint_trans Hmon Hsub1. (* A2 <: B \ S *)
  forwards (IH1 & IH2): IHk X Htrans. { lia. }
  (* IH1 : A2 <: A2 \ S ∪ {X} -> A2 <: B \ S ∪ {X} *)
  forwards Hres: IH1 Heq2b. (* A2 <: B \ S ∪ {X} *)
  apply ssub_d_andR; auto.
  (* Side conditions: lc_typ A1, ord B *)

You’ll need:

ssub_disjoint_monotonic: A <: B \ S ∪ {X} ∧ X ∉ S → A <: B \ S.
ssub_disjoint_trans: A <: B \ S ∧ B <: C \ S → A <: C \ S.
lc_typ A1 derivable from Hsub1 or other hypotheses.

For the size check on the IH call: |A2| + |B| < |A1 & A2| + |B| since |A1 & A2| = 1 + |A1| + |A2| > |A2|. ✓

Am I understanding your concern correctly?

I want to double-check, because your description (“we cannot ensure that the reflexive part… is consistent with the subtyping between A and B”) could mean something I’m not seeing. Let me articulate three possible concerns:

Concern A: “The self-eq of A2 in Heq2b doesn’t directly tell me how A2 relates to B; I only have A1 <: B, not A2 <: B.”

Resolution: Construct A2 <: B \ S via the cross-branch A2 <: A1 \ S ∪ {X} (mono + transitivity). The proof doesn’t need A2 <: B as input; it builds it.

Concern B: “The cross-branch fact A2 <: A1 \ S ∪ {X} is in the extended avoidance system, not \ S. I can’t directly chain it with A1 <: B \ S (which is in \ S).”

Resolution: Use avoidance-set monotonicity to weaken A2 <: A1 \ S ∪ {X} to A2 <: A1 \ S, then chain via transitivity in \ S.

Concern C: “Even with the transitivity chain, the resulting A2 <: B \ S might not be derivable in the original system if A2’s structure is incompatible with B.”

Resolution: This isn’t a concern — transitivity is admissible, so the chain is always valid.

Could you tell me which of these matches your concern, or describe more precisely what you’re worried about? My current read is that the case closes with the standard cross-branch + transitivity chain (Concerns A & B), and the proof script just needs the missing tactic invocation to put the pieces together.

If you’ve tried this and it doesn’t actually close in Coq (e.g., monotonicity or transitivity fails for some technical reason), that would be useful to know — it might point to a real gap I’m missing.

You

Thank you for your solution! The case works for me!

Now the only thing left is the simplification we made when analyzing intersections: we assumed that A1 and A2 are ordinary in A1 & A2.

Here is the generalization I tried, to solve the general case.

Do you think it is feasible?

Here is my current proof, focusing on Goal (1). The only case left is the ordinary condition, which I admit when inverting the A1 & A2 <: A1 & A2 case (it is admitted as a simplification, and cannot actually be solved in that form).

Lemma ssub_lift: forall k A B S,
  size_typ A + size_typ B < k →
  forall X, X notin S →
  ssub_disjoint A B S →
  (ssub_disjoint A A (add X S) → ssub_disjoint A B (add X S)) /\
  (ssub_disjoint B B (add X S) → ssub_disjoint A B (add X S)).
Proof with auto.
  induction k; introv Hsiz Hnin Hsub.
  { exfalso. lia. }
  split.
  { intros Heq.
    inverts Hsub as Hsub1 Hsub2 Hsub3; simple_inverting; auto.
    - (* A1 → A2 <: B1 → B2 *)
      forwards* (IH1a & IH2a): IHk X Hsub1. { lia. }
      forwards: IH2a H4.
      forwards* (IH1b & IH2b): IHk X Hsub2. { lia. }
    - (* A <: B1 & B2 *)
      simpl in *.
      forwards* (IH1a & IH2a): IHk X Hsub1. { lia. }
      forwards* (IH1b & IH2b): IHk X Hsub2. { lia. }
    - (* A1 & A2 <: B *)
      forwards (Heq1 & Heq2): ssub_andr_inv Heq.
      forwards [Heq1a|Heq1b]: ssub_andl_inv Heq1; simpl in *; auto.
      { admit. }
      (* What if A1 is not ordinary?
         A1 ... & ... An < B where Ai (i = 1 ... n) are ordinary
         Case on fixed (i):
           Given Ai <: B \ S
                 A1 ... & ... An <: A1 ... & ... An \ S∪{X}
                 <=> For each j, exists i, Ai <: Aj \ S∪{X}
           To prove, we need to show exists an k,
                 Ak <: Ak \ S∪{X}
                 and Ak <: Ai
           such that
                 Ak <: B \ S∪{X} (By IHk)
           so    A1 ... & ... An <: B B \ S∪{X} (By transitivity) *)
      (* admit the ordinary condition, for simplification *)
      (* TODO: generalize to nested ord/and *)
      + (* A1 ~ A1 \ S∪{X} implies A1 <: B \ S∪{X} *)
        forwards*: IHk X Hsub1. { lia. }
      + (* A2 <: A1 \ S∪{X} implies A2 <: A1 <: B *)
        forwards [Heq2a|Heq2b]: ssub_andl_inv Heq2; simpl in *; auto.
        { admit. } (* admit the ordinary condition, for simplification *)
        * forwards Heq': ssub_disjoint_trans Heq2a Heq1b.
          { apply Nat.lt_succ_diag_r. }
          forwards*: IHk Hsub1. { lia. }
        * (* Heq2b case *)
          forwards Hmon: ssub_disjoint_weakening Heq1b S. { fsetdec. } (* A2 <: A1 \ S *)
          forwards Htrans: ssub_disjoint_trans Hmon Hsub1. (* A2 <: B \ S *)
          { apply Nat.lt_succ_diag_r. }
          forwards (IH1 & IH2): IHk X Htrans... { lia. }
          (* IH1: A2 <: A2 \ S ∪ {X} → A2 <: B \ S ∪ {X} *)
          (* Side conditions: lc_typ A1, ord B *)
          (* admit. *)
          (* Clever proof, not stuck! *)
    - (* A1 & A2 <: B *)
      forwards (Heq1 & Heq2): ssub_andr_inv Heq.
      forwards [Heq2a|Heq2b]: ssub_andl_inv Heq2; simpl in *; auto.
      { admit. } (* admit the ordinary condition, for simplification *)
      (* TODO: generalize to nested ord/and *)
      + (* A1 <: A2 \ S∪{X} *)
        forwards [Heq1a|Heq1b]: ssub_andl_inv Heq1; simpl in *; auto.
        { admit. } (* admit the ordinary condition, for simplification *)
        * (* A1 ~ A1 <: A2 \ S∪{X}
             IH requires A2 ~ A2 \ S∪{X} *)
          forwards Hmon: ssub_disjoint_weakening Heq2a S. { fsetdec. } (* A1 <: A1 \ S *)
          forwards Htrans: ssub_disjoint_trans Hmon Hsub1. (* A1 <: A2 \ S *)
          { apply Nat.lt_succ_diag_r. }
          forwards (IH1 & IH2): IHk X Htrans... { lia. }
          (* IH1: A2 <: A2 \ S ∪ {X} → A2 <: B \ S ∪ {X} *)
          (* Clever proof, not stuck! *)
        * forwards Heq': ssub_disjoint_trans Heq1b Heq2a.
          { apply Nat.lt_succ_diag_r. }
          forwards*: IHk Hsub1. { lia. }
      + (* A2 ~ A2 \ S∪{X} implies A2 <: B \ S∪{X} *)
        forwards*: IHk X Hsub1. { lia. }
    - (* {l: A} <: {l: B} *)
      forwards* (IH1a & IH2a): IHk X Hsub1. { lia. }

  }

Claude

Generalizing to Non-Ordinary Components

Your idea is exactly right. Let me work through whether it’s feasible and what’s needed to make it precise.

Restating your approach

You wrote:

What if A1 is not ordinary?
A1 ... & ... An < B where Ai (i = 1 ... n) are ordinary
Case on fixed (i):
  Given Ai <: B \ S
        A1 ... & ... An <: A1 ... & ... An \ S∪{X}
        <=> For each j, exists i, Ai <: Aj \ S∪{X}
  To prove, we need to show exists an k,
        Ak <: Ak \ S∪{X}
        and Ak <: Ai
  such that
        Ak <: B \ S∪{X} (By IHk)
  so    A1 ... & ... An <: B B \ S∪{X} (By transitivity)

This is the right structural intuition. Let me formalize it.

The key generalization

When A1 & A2 is the LHS but A1 (or A2) is itself non-ordinary, we should think of the whole LHS as a flat conjunction of ordinary components: A_(1) & A_(2) & ... & A_(n) where each A_(i) is ordinary. The intersection structure is just associativity-modulo-grouping.

Key observation: in the avoidance system, (A1 & A2) <: B \ S with B ordinary corresponds (by repeated application of ssub_d_andL/R) to picking some ordinary component A_(i) such that A_(i) <: B \ S.

Similarly, (A1 & A2) <: (A1 & A2) \ S ∪ {X} corresponds to: for each ordinary component A_(j) of the RHS, there exists some ordinary component A_(i) of the LHS with A_(i) <: A_(j) \ S ∪ {X}.

This is your “for each j, exists i” condition.
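To make the “for each j, exists i” condition concrete, here is a minimal Python sketch of a subtyping checker with an avoidance set. It is an illustration under assumptions, not your Coq definition: the class names (Top, Var, Arrow, And, Rcd) and the functions sub and covers are hypothetical, the rules are my guess at the standard algorithmic rules for intersections without BCD, and “ordinary” is taken to mean “not an intersection”.

```python
from dataclasses import dataclass

# Hypothetical type syntax; names are illustrative, not taken from the Coq development.
@dataclass(frozen=True)
class Top:
    pass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Rcd:
    label: str
    body: object

def sub(a, b, avoid=frozenset()):
    # Decides "a <: b avoiding `avoid`" under the assumed rules.
    if isinstance(b, Top):
        return True                   # anything is a subtype of Top
    if isinstance(b, And):            # intersection introduction on the right
        return sub(a, b.left, avoid) and sub(a, b.right, avoid)
    if isinstance(a, And):            # b is ordinary here: pick one branch of a
        return sub(a.left, b, avoid) or sub(a.right, b, avoid)
    if isinstance(a, Var) and isinstance(b, Var):
        # Variable reflexivity is forbidden for avoided variables.
        return a.name == b.name and a.name not in avoid
    if isinstance(a, Arrow) and isinstance(b, Arrow):
        return sub(b.dom, a.dom, avoid) and sub(a.cod, b.cod, avoid)
    if isinstance(a, Rcd) and isinstance(b, Rcd):
        return a.label == b.label and sub(a.body, b.body, avoid)
    return False

def covers(components, avoid):
    # For each component index j, list the indices i with A_i <: A_j avoiding `avoid`.
    return {j: [i for i, ai in enumerate(components) if sub(ai, aj, avoid)]
            for j, aj in enumerate(components)}

X = Var("X")
t1 = Arrow(X, Top())        # X -> Top
t2 = Arrow(Top(), Top())    # Top -> Top
whole = And(t1, t2)
avoid_x = frozenset({"X"})

print(sub(whole, whole, avoid_x))   # the intersection is self-related while avoiding X
print(sub(t1, t1, avoid_x))         # the first component alone is not
print(covers([t1, t2], avoid_x))    # both components are covered by index 1
```

On this example the cover map sends both components to Top -> Top, which is the cross-branch situation: the first component is covered only by the other one.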

The proof strategy generalized

Let me restate:

Setup: A1 & A2 = A_(1) & ... & A_(n) (flat conjunction of ordinary components). Premise 2 gives, for some specific i*, A_(i*) <: B \ S (the surviving sub-derivation of premise 2 picks one component). Premise 1 gives, for each j, some i_j with A_(i_j) <: A_(j) \ S ∪ {X}.

Goal: A1 & A2 <: B \ S ∪ {X}, which by repeated ssub_d_andL/R reduces to: find some k such that A_(k) <: B \ S ∪ {X}.

Strategy to find such k: We need:

A_(k) <: B \ S ∪ {X}, derivable (by IH (1)) from A_(k) <: B \ S and A_(k)’s self-eq.
A_(k) <: B \ S derivable: chain A_(k) <: A_(i*) \ S (via monotonicity from A_(k) <: A_(i*) \ S ∪ {X}) and A_(i*) <: B \ S (premise 2).
So we need A_(k) <: A_(i*) \ S ∪ {X} (i.e., k = i_{i*} — the component that covers A_(i*) in premise 1).
And A_(k) ~ A_(k) \ S ∪ {X} (self-eq of A_(k)).

So the candidate k is i_{i*} — the LHS component that, in premise 1, covers A_(i*) (the component that participates in premise 2).

Question: does A_(k) (where k = i_{i*}) have self-equivalence in \ S ∪ {X}?

By premise 1, for each j, some i_j covers A_(j). In particular, for j = k, some i_k with A_(i_k) <: A_(k) \ S ∪ {X}.

If i_k = k, then A_(k) <: A_(k) \ S ∪ {X} — direct self-eq. ✓

If i_k ≠ k, then A_(i_k) <: A_(k) \ S ∪ {X} is cross-branch. We’d then need to chain further to derive A_(k)’s self-eq, possibly via transitivity.

The structural lemma we need

This generalization needs an inductive lemma about flat intersections. Let me state it.

Lemma (Generalized Cross-Branch). Suppose A = A_(1) & ... & A_(n) (flat intersection of ordinary components, in some grouping). Suppose A <: A \ S ∪ {X}. Then for any i* and any B such that A_(i*) <: B \ S and B is ordinary, there exists some k such that:

A_(k) <: B \ S ∪ {X} (so we can conclude A <: B \ S ∪ {X} via repeated intersection elim).

The proof of this auxiliary lemma involves a “chasing” argument: start with i*, find i_{i*} covering A_(i*). Check if A_(i_{i*}) has self-eq directly. If not, find i_{i_{i*}} covering A_(i_{i*}). Continue.

This chase either:

(a) Terminates at some k with self-eq directly (i.e., i_k = k), or
(b) Produces a cycle k_1 → k_2 → ... → k_m → k_1 (where k_{p+1} = i_{k_p}, so each step gives A_(k_{p+1}) <: A_(k_p) \ S ∪ {X}). By transitivity around the cycle, A_(k_1) <: A_(k_1) \ S ∪ {X}, which is the self-eq of A_(k_1).

Either way, we find some k in the cycle (or at the terminus) with self-eq, and A_(k) <: A_(i*) by transitivity along the chase.

A concrete rephrasing

Since the indices i_j form a function from {1, ..., n} to itself, the “chase” starting at any element ends in a cycle (the domain is finite). Every element on the cycle has self-eq (by transitivity around the cycle), and every element on the cycle is a subtype, in \ S ∪ {X}, of the chase’s starting point (by composing the transitivity steps along the chase).

So: for the chase starting at i*, the cycle’s elements all have self-eq, and they all are <: A_(i*) (transitively). Pick any element k on the cycle.

Then:

A_(k) <: A_(k) \ S ∪ {X} (self-eq from cycle).
A_(k) <: A_(i*) \ S ∪ {X} (from chase).
A_(k) <: A_(i*) \ S (monotonicity).
A_(i*) <: B \ S (premise 2).
A_(k) <: B \ S (transitivity).
A_(k) <: B \ S ∪ {X} (IH (1) with self-eq).
A <: B \ S ∪ {X} (intersection elim along the path from A to A_(k)).

✓

Feasibility for mechanization

The “chase to a cycle” argument is structurally fine for mechanization, but it’s not entirely trivial. Some considerations:

Option 1: Flatten the intersection up-front.

Define a function flatten : typ → list typ that turns A1 & A2 into a list of ordinary components, then prove premise 1 in the flat form (each RHS component is covered by some LHS component, giving an indexing function), then do the chase.

This is conceptually clean but requires substantial setup (flattening function, the equivalence between intersection and flat form, the “covers” relation, etc.).

Option 2: Work directly with the binary intersection structure.

Instead of flattening, do the case analysis on the binary tree structure of A1 & A2. When A1 is itself an intersection A_aa & A_ab, recurse on A1’s structure, treating A_aa & A_ab & A2 as a 3-way intersection (in some grouping).

This avoids the flattening setup but the recursion structure is more involved.

Option 3: Strengthened induction with a direct cycle argument.

State and prove a generalized lemma that does the chase directly, without explicitly flattening.

I think Option 1 is cleanest for mechanization, despite the upfront cost. Here’s why:

The chase argument is clearer when stated over a flat list.
The cycle finding is more intuitive (it’s just iterating a function on a finite set).
The flattening can be done once and reused for both Goal (1) and Goal (2).
Many Coq libraries have list infrastructure that helps.

Let me sketch what Option 1 looks like

(* Flatten an intersection into a list of ordinary components *)
Fixpoint flatten (T : typ) : list typ :=
  match T with
  | typ_and T1 T2 => flatten T1 ++ flatten T2
  | _ => [T]
  end.

(* Lemma: if A <: B \ S with B ordinary, then some T in flatten A has T <: B \ S *)
Lemma flatten_andL_inv : forall A B S,
  ord B ->
  ssub_disjoint A B S ->
  exists T, In T (flatten A) /\ ssub_disjoint T B S /\ ord T.

(* Lemma: if A <: A \ S, then for each T in flatten A, some T' in flatten A has T' <: T \ S *)
Lemma flatten_self_eq_inv : forall A S,
  ssub_disjoint A A S ->
  forall T, In T (flatten A) ->
  exists T', In T' (flatten A) /\ ssub_disjoint T' T S /\ ord T.

(* Lemma: cycle finding *)
Lemma cycle_in_self_map : forall (l : list typ) (f : typ -> typ),
  (forall T, In T l -> In (f T) l) ->
  forall T, In T l ->
  exists T', In T' l /\ T' = ... (* on cycle reachable from T *)

Then the main proof’s intersection case becomes a relatively clean application of these lemmas.
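As a quick sanity check of the flattening idea, here is a small Python mirror of the flatten function sketched above. The classes Base and And are hypothetical stand-ins (Base plays the role of any non-intersection type), and the sketch only illustrates that the list view forgets how the intersection was grouped.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Base:
    name: str          # hypothetical stand-in for any non-intersection type

@dataclass(frozen=True)
class And:
    left: object
    right: object

def flatten(t):
    # List the non-intersection leaves of t, left to right.
    if isinstance(t, And):
        return flatten(t.left) + flatten(t.right)
    return [t]

a1, a2, a3 = Base("A1"), Base("A2"), Base("A3")
left_nested = And(And(a1, a2), a3)     # (A1 & A2) & A3
right_nested = And(a1, And(a2, a3))    # A1 & (A2 & A3)

print(flatten(left_nested) == flatten(right_nested))
```

Both groupings produce the same component list, which is why the chase can be stated over the list without tracking the tree shape.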

The cycle argument’s correctness

Let me double-check the cycle argument. We have a finite list [A_1, ..., A_n] of ordinary components and a function f : {1,...,n} → {1,...,n} where f(j) = i_j is the index of the component that covers A_j (i.e., A_{f(j)} <: A_j \ S ∪ {X}).

Starting at any j_0, iterate: j_0, f(j_0), f²(j_0), .... By pigeonhole, some f^p(j_0) = f^q(j_0) for p < q. The cycle is f^p(j_0), f^{p+1}(j_0), ..., f^{q-1}(j_0), f^p(j_0).

Let me pin down the direction of the chase carefully, since it is easy to get backwards.

f(j) is the index covering j, so A_{f(j)} <: A_j. Then chasing forward: A_{f²(j)} <: A_{f(j)} <: A_j, etc. So forward chase gives smaller-and-smaller types in the <: order — going “down” toward subtypes.

The chase: writing j_p = f^p(j_0), we get A_{j_p} <: A_{j_{p-1}} <: ... <: A_{j_0}. If j_q = j_p for some q > p, then the segment A_{j_q} <: A_{j_{q-1}} <: ... <: A_{j_p} gives A_{j_p} <: A_{j_p} by transitivity, which is exactly reflexivity modulo S ∪ {X} for A_{j_p}. ✓ self-eq of A_{j_p}.

And A_{j_p} <: A_{j_0} \ S ∪ {X} by transitivity along the chase.

So with j_0 = i*: pick j_p on the cycle. Have A_{j_p}’s self-eq and A_{j_p} <: A_{i*} \ S ∪ {X}. ✓
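The pigeonhole chase itself is just iteration of a function on a finite index set, which the following Python sketch illustrates. The function name chase_to_cycle and the dictionary encoding of f are my own illustrative choices; the sketch assumes f[j] is an index with A_{f[j]} <: A_j \ S ∪ {X} and does not check any subtyping itself.

```python
def chase_to_cycle(f, start):
    # Iterate f from `start` until an index repeats (at most len(f) steps by pigeonhole).
    # Returns (k, tail, cycle):
    #   k     : the first repeated index, which lies on a cycle of f
    #   tail  : start, ..., k   (each step j -> f[j] witnesses A_{f[j]} <: A_j)
    #   cycle : k, ..., last index before returning to k
    seen = {}          # index -> position in path
    path = []
    j = start
    while j not in seen:
        seen[j] = len(path)
        path.append(j)
        j = f[j]
    k = j
    tail = path[:seen[k] + 1]
    cycle = path[seen[k]:]
    return k, tail, cycle

# Hypothetical cover function on three components: 0 is covered by 1,
# 1 is covered by 2, and 2 is covered by 1.
f = {0: 1, 1: 2, 2: 1}
k, tail, cycle = chase_to_cycle(f, 0)
print(k, tail, cycle)
```

Reading the result for this f: the tail 0, 1 composes to A_1 <: A_0, and the cycle 1, 2 (closing back to 1) composes to A_1 <: A_1, so k = 1 is the component that the Lemma 2’ argument would pick. A fixed point f[k] = k shows up as a cycle of length one.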

Caveats

A few honest flags:

Premise 1’s structure for non-ordinary A_j: I said “for each j, some i_j with A_{i_j} <: A_j,” but this is only directly the case when A_j is ordinary. If A_j is non-ordinary, premise 1’s decomposition uses ssub_d_and (introducing an intersection on the right), and we need to recurse into A_j’s structure. This is exactly what flatten handles: by flattening both sides, the components are all ordinary, and the indexing function is well-defined.
The function f might not be unique. Multiple i_j’s might cover A_j. For the cycle argument, any choice works, so pick one.
Termination of the chase. Pigeonhole gives termination, but mechanizing pigeonhole on lists requires some care (e.g., via NoDup lists or by bounding chase length by list length).
The chase argument generalizes to Goal (2) symmetrically. Same setup, same cycle argument.
I’ve assumed transitivity, monotonicity, and the ability to convert between binary intersection structure and flat list. All standard but need to be in your formalization.
The flattening lemmas (flatten_andL_inv, etc.) themselves need proof, but should be straightforward inductions on the binary intersection structure.

My honest assessment

Yes, your idea is feasible and correct. The cycle argument formalizes the intuition “follow the cross-branch covers until you find a self-eq cycle.” The mechanization is non-trivial — it requires flattening infrastructure or careful binary recursion — but it’s a standard kind of meta-theoretic argument and shouldn’t have any deep obstacles.

If you want to avoid the flattening overhead, the alternative is to do the case analysis directly on binary intersections, but I expect that to be more painful in terms of bookkeeping. Flattening is conceptually cleaner.

The key new lemma (which I’d suggest mechanizing as a standalone) is:

Cycle Lemma: For any finite list l of types and any function f : (indices of l) → (indices of l) such that A_{f(j)} <: A_j \ S ∪ {X} for all j, and any starting index i*, there exists k with A_k <: A_k \ S ∪ {X} and A_k <: A_{i*} \ S ∪ {X}.

Once you have this, the non-ordinary case of Lemma 2’ closes cleanly.

Want me to flesh out the proof of the Cycle Lemma in more detail, or is this enough to proceed with the mechanization?

You

Write a short summary of progress to my PhD advisor. This is my weekly update; he already knows about the lemma (so just a few words describing the theorem and notation as a reminder).

Then show the proof, and the next step is to complete the cycle lemma reasoning

Claude

Weekly Update

Progress

Made significant progress on the substitution inversion theorem for our subtyping system with intersection types and nominal labels. Recall the statement: if A[X→{X:C}] ~ B[X→{X:D}] \ S (mutual subtyping under avoidance set S), then either C ~ D or A ~ B \ S ∪ {X}. The proof routes through an auxiliary Lemma 2’ that lifts subtyping from \ S to \ S ∪ {X} given self-equivalence of one side.

The key technical insight was formulating Lemma 2’ with dual goals proven simultaneously, which handles the contravariance flip in the arrow case cleanly:

Lemma 2’. Suppose A <: B \ S and X ∉ S. Then:

(1) A ~ A \ S ∪ {X} ⟹ A <: B \ S ∪ {X}

(2) B ~ B \ S ∪ {X} ⟹ A <: B \ S ∪ {X}

Proof by strong induction on |A| + |B|, case-analyzing on the last rule of A <: B \ S. Goals (1) and (2) feed each other across the contravariant arrow position, which avoids needing self-equivalence of both sides as input.

The intersection case was the most subtle. The naive subclaim “self-equivalence of T₁ & T₂ decomposes into self-equivalence of components” turns out to be false — there’s a counterexample like T₁ = X -> Top, T₂ = Top -> Top where the intersection has self-equivalence in \ S ∪ {X} via cross-branch subsumption (T₂ <: T₁) without T₁ itself having self-equivalence.

The fix: case-analyze on premise 1’s decomposition, and in cross-branch cases, construct a smaller sub-derivation via transitivity through the cross-branch fact, then recurse with the IH on the other branch’s self-equivalence. This adaptively switches which intersection rule (ssub_d_andL or ssub_d_andR) is used in the conclusion based on the premise’s structure.

The proof script is now working modulo one remaining simplification: the case analysis assumes A1 and A2 are ordinary, which I admitted in the script.

Next Step

Generalize the intersection case to non-ordinary components. The plan is to formulate a Cycle Lemma that flattens nested intersections into lists of ordinary components and uses a pigeonhole-style cycle argument:

Cycle Lemma (planned). Given the flat ordinary components A_1, ..., A_n of an intersection with (A_1 & ... & A_n) <: (A_1 & ... & A_n) \ S ∪ {X}, for each j some A_{f(j)} <: A_j \ S ∪ {X}. Iterating f from any starting index produces a cycle (by pigeonhole), and elements on the cycle have self-equivalence by transitivity around the cycle.

This will close the remaining admits in the Lemma 2’ proof and then I can wire up the full inversion theorem.

You

He knows the lemma statement, but didn’t know the concrete proof framework I am taking, so having more context on how lemma 2’ comes up will be useful

Claude

Weekly Update

Progress

Made significant progress on the substitution inversion theorem for our subtyping system with intersection types and nominal labels: if A[X→{X:C}] ~ B[X→{X:D}] \ S, then either C ~ D or A ~ B \ S ∪ {X}.

Proof framework

I worked out the overall proof structure as induction on |A| + |B| with case analysis on the shapes of A and B. The hard case is when A is an intersection A1 & A2 and B is ordinary. After inverting the equivalence:

The <: direction picks one branch (WLOG A1[X→{X:C}] <: B[X→{X:D}] \ S).
The :> direction gives both B[X→{X:D}] <: A1[X→{X:C}] \ S and B[X→{X:D}] <: A2[X→{X:C}] \ S.

So A1[X→{X:C}] ~ B[X→{X:D}] \ S mutually, and we can apply IH to (A1, B). In the non-trivial branch, this gives A1 ~ B \ S ∪ {X}. From this, transitivity immediately yields B ~ B \ S ∪ {X} (the self-equivalence of B in the extended avoidance system).

To finish, we need to derive B <: A2 \ S ∪ {X} for the conclusion’s :> direction. We have B[X→{X:D}] <: A2[X→{X:C}] \ S, and a separate Substituent Erasure lemma converts this into B <: A2 \ S (un-substituting safely since X doesn’t appear as a label originally).

So the remaining gap is exactly:

Given B <: A2 \ S and B ~ B \ S ∪ {X}, derive B <: A2 \ S ∪ {X}.

This is the lifting principle that became Lemma 2’.

Lemma 2’ formulation

The natural statement needs to handle both directions for the contravariance flip in arrows. Stating it with dual goals proven simultaneously is what makes the induction go through:

Lemma 2’. Suppose A <: B \ S and X ∉ S. Then:

(1) A ~ A \ S ∪ {X} ⟹ A <: B \ S ∪ {X}

(2) B ~ B \ S ∪ {X} ⟹ A <: B \ S ∪ {X}

The two goals feed each other across the contravariant arrow position: when recursing into the contravariant subgoal B_dom <: A_dom, the side that had self-equivalence in the outer goal corresponds to the opposite side of the inner subgoal, so we switch between (1) and (2) at each contravariance flip. This avoids needing self-equivalence of both sides as input.

Intersection case subtlety

The intersection case was the most delicate. The naive subclaim “self-equivalence of T1 & T2 decomposes into self-equivalence of components” is false: with T1 = X -> Top, T2 = Top -> Top, the intersection has self-equivalence via cross-branch subsumption (T2 <: T1) without T1 itself having self-equivalence.

The fix: case-analyze on premise 1’s decomposition, and in cross-branch cases, construct a smaller sub-derivation via transitivity through the cross-branch fact, then recurse with IH on the other branch’s self-equivalence. This adaptively switches which rule (ssub_d_andL or ssub_d_andR) is used in the conclusion based on the premise’s structure.

The proof script is now working modulo one remaining simplification: the case analysis assumes intersection components are ordinary, which I admitted.

Next Step

Generalize the intersection case to non-ordinary components. The plan is a Cycle Lemma that flattens nested intersections into lists of ordinary components and uses a pigeonhole-style cycle argument:

Cycle Lemma (planned). Given flat ordinary components A_1, ..., A_n with (A_1 & ... & A_n) <: (A_1 & ... & A_n) \ S ∪ {X}, for each j some A_{f(j)} <: A_j \ S ∪ {X}. Iterating f from any starting index produces a cycle (by pigeonhole), and elements on the cycle have self-equivalence by transitivity around the cycle.

This will close the remaining admits in Lemma 2’, after which I can wire up the full inversion theorem.

You

Finish the FILL IN HERE and give (local) suggestions on improving the language

Claude

Filled-In Sections

Here are the two FILL-IN-HERE sections, written to fit your draft’s tone and level of detail.

First FILL-IN-HERE (after “Formally, the proof draft is:”)

The proof proceeds by induction on |A| + |B|, simultaneously proving (1) and (2), with case analysis on the last rule of A <: B \ S. The dual formulation is essential for the arrow case: when recursing into the contravariant subgoal B_dom <: A_dom, the side that had self-equivalence in the outer goal corresponds to the opposite side of the inner subgoal, so we alternate between invoking (1) and (2) at each contravariance flip.

The intersection case (focusing on Goal (1), with A = A1 & A2, B ordinary, sub-derivation A1 <: B \ S, premise A1 & A2 ~ A1 & A2 \ S ∪ {X}) splits via inversion of the premise into:

Heq1 : A1 & A2 <: A1 \ S ∪ {X}
Heq2 : A1 & A2 <: A2 \ S ∪ {X}

Each of these (with A1, A2 ordinary, by the simplifying assumption) further inverts via ssub_d_andL or ssub_d_andR into four subcases. The interesting ones are:

Heq1 via-andL: A1 <: A1 \ S ∪ {X} directly. Apply IH (1) to A1 <: B \ S with A1’s self-eq.
Heq1 via-andR + Heq2 via-andL: gives A2 <: A1 \ S ∪ {X} and A1 <: A2 \ S ∪ {X}. Transitivity yields A1 <: A1 \ S ∪ {X}, reducing to the previous subcase.
Heq1 via-andR + Heq2 via-andR: gives A2 <: A1 \ S ∪ {X} (cross-branch) and A2 <: A2 \ S ∪ {X} (self-eq of A2). Construct A2 <: B \ S by weakening the cross-branch fact to \ S (monotonicity in the avoidance set), followed by transitivity with A1 <: B \ S. Apply IH (1) to this constructed sub-derivation with A2’s self-eq, obtaining A2 <: B \ S ∪ {X}. Conclude via ssub_d_andR.

Goal (2) and the symmetric ssub_d_andR case follow by analogous reasoning.

Second FILL-IN-HERE (after “the path… <: Ax / S”)

This is captured by a Cycle Lemma:

Cycle Lemma. Let A1, ..., An be ordinary types with A1 & ... & An <: A1 & ... & An \ S ∪ {X}. Then for each j, some f(j) satisfies A_{f(j)} <: A_j \ S ∪ {X}. For any starting index i*, iterating f produces (by pigeonhole on {1, ..., n}) a cycle, and any element k on this cycle satisfies:

A_k <: A_k \ S ∪ {X} (by transitivity around the cycle)

A_k <: A_{i*} \ S ∪ {X} (by transitivity along the chase from i* to the cycle)

In the proof of Lemma 2’, the flattened intersection on the LHS of Hsub1 provides some ordinary A_{i*} with A_{i*} <: B \ S. The Cycle Lemma yields a k with self-eq and A_k <: A_{i*}. Monotonicity and transitivity give A_k <: B \ S; IH (1) lifts to A_k <: B \ S ∪ {X}; and the original intersection structure recovers A1 & ... & An <: B \ S ∪ {X} by the intersection elimination rules.

The mechanization will likely flatten intersections into lists upfront, prove inversion lemmas (flatten_andL_inv, flatten_self_eq_inv) bridging the binary tree structure to the list view, and then state the Cycle Lemma over the list.

Local Language Suggestions

A few small things I’d tweak. Most are minor — your writing is clear and well-organized.

Opening:

“Sorry for the late update. I got stuck on proving the substitution inversion theorem and took some extra time to think about the proof. I have a good idea now.”

The “Sorry for the late update” framing positions the message defensively; consider:

“Update is a bit late this week — I got stuck on the substitution inversion theorem and wanted to work it through before writing. I have a good idea now.”

This signals that the delay produced something concrete.

On “weakly positive subtyping”:

“Recall that we are defining for intersection types (without BCD) the weakly positive subtyping that requires the extended weakly positive restriction.”

Slightly tangled. Consider:

“Recall that we are defining weakly positive subtyping for intersection types (without BCD), which requires extending the weakly positive restriction.”

On the variable set motivation:

“The new restriction tracks a set of variables instead of one as originally formulated, because we need to ensure all variables follow the same intersection subtyping rule.”

The “follow the same intersection subtyping rule” is a bit opaque. Consider:

“The new restriction tracks a set of variables rather than a single variable, because in the presence of intersection types, multiple variables may need to be uniformly tracked across the same derivation tree.”

On “richer check”:

“the single free variable check to a richer check that a set of variables are not actively compared in the mutual subtyping ”

“actively compared” is exactly the right phrase, but consider tightening:

“the single free-variable check is replaced by the condition that a set of variables does not actively participate in the subtyping comparison ”

On the observation about B ~ B:

“At this point, I observe that B ~ B \ S ∪ {X} (or simply, B <: B \ S ∪ {X}) is a very good specification itself!”

The exclamation point reads a bit casual. Consider:

“Here it’s worth noting that B ~ B \ S ∪ {X} — equivalently, B <: B \ S ∪ {X} — is a meaningful specification in its own right.”

On the example:

“For example, B can contain X, B = (X -> Top) & (Top -> Top), but it is possible to conclude (X -> Top) & (Top -> Top) <: (X -> Top) & (Top -> Top) without going through X (by Top -> Top <: X -> Top)”

Slightly clunky. Try:

“For example, with B = (X -> Top) & (Top -> Top), the variable X appears free in B, yet B <: B \ S ∪ {X} is still derivable: the comparison goes through Top -> Top <: X -> Top (which absorbs X contravariantly under Top) rather than directly comparing X to itself.”

On the rescue:

“The rescue here is that, for the cross-branch cases, we can also construct a smaller sub-derivation via transitivity through the cross-branch fact, then recurse with IH on the other branch’s self-equivalence.”

Good content; the prose is dense. Consider splitting:

“The rescue is to handle the cross-branch cases differently: we construct a smaller sub-derivation via transitivity through the cross-branch fact, then recurse with the IH on the other branch’s self-equivalence.”

Closing list:

“After resolving this final lemma, I believe it would be possible to pop up the problem stack and get the”

The sentence trails off (“get the” is followed by a list). Either complete the sentence or replace with a colon:

“After resolving this final lemma, I expect to pop the proof stack and complete the chain:”

On the final bullet:

“We get a complete metatheory for weakly positive subtyping and can extend to distributivity (maybe then the first step would be to try BCD without recursive rule, which should be easier because it simply mounts to generalize the existing definition of ordinary and splittable)”

“mounts to” should probably be “amounts to”. Also:

“We get a complete metatheory for weakly positive subtyping, and can extend to distributivity. A natural first step there would be BCD without the recursive rule, which should be easier — it amounts to generalizing the existing definitions of ordinary and splittable.”

Let me know if you want me to write a fully revised version of the document with all changes integrated, or if the spot suggestions are enough.

You

I will send the main contents as a separate md file. Can you suggest where to put it, and write a several-sentence Slack message to Bruno to go along with the file?

Claude

Slack Message Suggestion

Here’s a short message to send along with the file:

Hi Bruno, here’s my weekly update — sending as an attached note since it’s a bit detailed. Short version: I got past the stuck point on the substitution inversion theorem. The key was reformulating the auxiliary lifting lemma with dual goals proved simultaneously, which handles the contravariance flip cleanly, and recognizing that B ~ B \ S ∪ {X} (extended self-equivalence) is the right specification for the absorption pattern. There’s one remaining simplification (assuming intersection components are ordinary) that I’m working on now via a cycle/pigeonhole argument over the flattened intersection. Details in the note. Happy to discuss whenever convenient.

On where to put the file

A few options depending on your usual workflow with Bruno:

1. Attach directly in Slack: simplest if Bruno reads updates inline. Just drop the .md file into the message.
2. Drop into a shared workspace folder: if you have a shared Dropbox / Google Drive / OneDrive folder where you keep weekly updates or research notes, place it there and link in the Slack message. Good for long-term reference.
3. Commit to your project repository: if your weekly updates live alongside the formalization (e.g., in a notes/ or weekly/ directory), commit it there and link or mention the path. This keeps the update co-located with the code you’re discussing.

If you don’t already have a convention with Bruno, I’d suggest option 1 (Slack attachment) for simplicity, with the message above as the wrapper. If he prefers a more durable record, option 2 or 3 is better — and you can adjust the message to point to the file location instead of attaching:

…Details in notes/2025-05-08-substitution-inversion.md (or wherever).

Let me know if you’d like the message tuned differently — shorter, more casual, or with different emphasis.

You

let’s get back to the proof of Lemma2’

k: nat
IHk: forall (A B: typ) (S: atoms), size_typ A + size_typ B < k → forall X: atom, X notin S → ssub_disjoint A B S → (ssub_disjoint A A (add X S) → ssub_disjoint A B (add X S)) /\ (ssub_disjoint B B (add X S) → ssub_disjoint A B (add X S))
A: typ
S: atoms
B1, B2: typ
Hsiz: size_typ A + Datatypes.S (size_typ B1 + size_typ B2) < Datatypes.S k
X: atom
Hnin: X notin S
Heq: ssub_disjoint (typ_and B1 B2) (typ_and B1 B2) (add X S)
Hsub1: ssub_disjoint A B1 S
Hsub2: ssub_disjoint A B2 S
Heq1: ssub_disjoint (typ_and B1 B2) B1 (add X S)
Heq2: ssub_disjoint (typ_and B1 B2) B2 (add X S)
Heq1b: ssub_disjoint B2 B1 (add X S)
Heq2a: ssub_disjoint B1 B2 (add X S)
(1 / 1)
ssub_disjoint A B1 (add X S)

What about this case?

- (* A <: B1 & B2 *)
  forwards (Heq1 & Heq2): ssub_andr_inv Heq.
  apply ssub_d_and.
  + forwards [Heq1a|Heq1b]: ssub_andl_inv Heq1; simpl in *; auto.
    { admit. } (* TODO: generalize to nested ord/and *)
    * (* B1 <: B1 \ S∪{X} *)
      forwards*: IHk X Hsub1. { lia. }
    * (* B2 <: B1 \ S∪{X} *)
      forwards [Heq2a|Heq2b]: ssub_andl_inv Heq2; simpl in *; auto.
      { admit. }
      ** (* Heq2b case *)

Claude

Analyzing This Case

Let me parse the proof state carefully, starting from the script:

(* A <: B1 & B2 *)
forwards (Heq1 & Heq2): ssub_andr_inv Heq.
apply ssub_d_and.

You’re proving A <: B1 & B2 \ S ∪ {X} and have split it via ssub_d_and into two subgoals: A <: B1 \ S ∪ {X} and A <: B2 \ S ∪ {X}. The current goal is the first: A <: B1 \ S ∪ {X}.

So you have:

Hsub1 : A <: B1 \ S (the relevant sub-derivation for this subgoal)
Hsub2 : A <: B2 \ S (for the other subgoal, deferred)
Heq : (B1 & B2) <: (B1 & B2) \ S ∪ {X} (premise 1, self-eq of the RHS intersection)
Heq1 : (B1 & B2) <: B1 \ S ∪ {X} (from inversion of Heq)
Heq2 : (B1 & B2) <: B2 \ S ∪ {X} (from inversion of Heq)
Heq1b : B2 <: B1 \ S ∪ {X} (cross-branch, from Heq1 via andR)
Heq2a : B1 <: B2 \ S ∪ {X} (cross-branch, from Heq2 via andL)

Goal: A <: B1 \ S ∪ {X}.

Wait, this is interesting. We have both cross-branch facts: B2 <: B1 and B1 <: B2. By transitivity:

B1 <: B2 <: B1 \ S ∪ {X} gives B1 <: B1 \ S ∪ {X}, which is B1's self-eq.
B2 <: B1 <: B2 \ S ∪ {X} gives B2 <: B2 \ S ∪ {X}, which is B2's self-eq.

Both branches have self-eq via the bidirectional cross-branch chain. So this case is actually nice.

Closing the case

For the current goal A <: B1 \ S ∪ {X}:

Derive B1 <: B1 \ S ∪ {X} by transitivity of Heq2a and Heq1b.
Apply IH (2) to Hsub1 : A <: B1 \ S with B1's self-eq to get A <: B1 \ S ∪ {X}. ✓

Concretely in Coq:

** (* Heq1b + Heq2a case: bidirectional cross-branch *)
  forwards Hself1: ssub_disjoint_trans Heq2a Heq1b. { apply Nat.lt_succ_diag_r. }
  (* Hself1 : B1 <: B1 \ S ∪ {X} *)
  forwards (IH1 & IH2): IHk X Hsub1. { lia. }
  forwards: IH2 Hself1.
  (* new hypothesis: A <: B1 \ S ∪ {X}, which is exactly the goal *)
  assumption.

Wait actually I want to double-check the labeling. You have Heq2a with comment “B1 <: B2 \ S ∪ {X}” — let me re-derive.

Heq2 : (B1 & B2) <: B2 \ S ∪ {X}. By ssub_andl_inv (with B2 ordinary), this gives either:

Heq2a (via-andL): B1 <: B2 \ S ∪ {X} — cross-branch.
Heq2b (via-andR): B2 <: B2 \ S ∪ {X} — direct self-eq.

You’re in the Heq2a branch (via-andL), which gives B1 <: B2 \ S ∪ {X}. ✓ matches the hypothesis.

And Heq1b : B2 <: B1 \ S ∪ {X} from ssub_andl_inv Heq1 via-andR. ✓

So Heq2a : B1 <: B2 and Heq1b : B2 <: B1, both in \ S ∪ {X}. By transitivity, B1 <: B1 \ S ∪ {X}. Done.

What about the symmetric subgoal?

Once you apply ssub_d_and, you have two subgoals: A <: B1 \ S ∪ {X} (current) and A <: B2 \ S ∪ {X}. The structure of the second is symmetric: same hypotheses, with the roles of B1 and B2 swapped. The same bidirectional cross-branch fact gives B2's self-eq, then IH (2) on Hsub2.

One caveat: because your script applies ssub_d_and first, the case analysis on Heq1 and Heq2 happens inside the first subgoal only. After finishing the current Heq1b + Heq2a subcase, you will have to repeat the same case analysis for the other subgoal.

Better approach: pre-compute all the cross-branch information before splitting via ssub_d_and, so you don’t redo the case analysis. Something like:

- (* A <: B1 & B2 *)
  forwards (Heq1 & Heq2): ssub_andr_inv Heq.
  forwards [Heq1a|Heq1b]: ssub_andl_inv Heq1; simpl in *; auto.
  { admit. } (* B1 non-ordinary *)
  forwards [Heq2a|Heq2b]: ssub_andl_inv Heq2; simpl in *; auto.
  { admit. } (* B2 non-ordinary *)
  
  (* Now four subcases: derive self-eq for B1 and B2 in each, then split via ssub_d_and *)
  + (* Heq1a + Heq2a: B1 self-eq direct, B1 <: B2 cross *)
    (* B1 self-eq: Heq1a directly *)
    (* B2 self-eq: ??? need another route *)
    ...
  + (* Heq1a + Heq2b: B1 self-eq direct, B2 self-eq direct *)
    ...
  + (* Heq1b + Heq2a: B2 <: B1 cross, B1 <: B2 cross *)
    (* B1 self-eq: trans of Heq2a and Heq1b *)
    (* B2 self-eq: trans of Heq1b and Heq2a *)
    ...
  + (* Heq1b + Heq2b: B2 <: B1 cross, B2 self-eq direct *)
    (* B1 self-eq: ??? *)
    ...

Hmm, the four subcases aren’t all symmetric. Let me work through each.

Subcase Heq1a + Heq2a: B1 <: B1 direct (from Heq1a), B1 <: B2 cross (from Heq2a). So B1 has self-eq directly. For B2, we’d need B2 <: B2 \ S ∪ {X} — do we have it? Not directly. Can we derive it?

We have B1 <: B1 and B1 <: B2, but not B2 <: B1 in this subcase, so no chain of these facts ends in B2 <: B2. B2's self-eq is therefore not directly available.

Wait, but premise 1, (B1 & B2) <: (B1 & B2) \ S ∪ {X}, should imply both B1's and B2's self-eq somehow… or does it?

Going back to my counterexample: B1 = X -> Top, B2 = Top -> Top. We have B1 & B2 ~ B1 & B2 \ S ∪ {X}, and B2 ~ B2 \ S ∪ {X} (no X in B2), but B1 ~ B1 \ S ∪ {X} fails.

In this counterexample:

Heq1: (B1 & B2) <: B1 \ S ∪ {X} = (X -> Top) & (Top -> Top) <: X -> Top \ S ∪ {X}. Try ssub_d_andL: X -> Top <: X -> Top is blocked. Try ssub_d_andR: Top -> Top <: X -> Top ✓ via Top absorption. So Heq1 holds via andR, giving Heq1b : B2 <: B1.
Heq2: (B1 & B2) <: B2 \ S ∪ {X} = (X -> Top) & (Top -> Top) <: Top -> Top \ S ∪ {X}. Try ssub_d_andL: X -> Top <: Top -> Top requires Top <: X, which fails. Try ssub_d_andR: Top -> Top <: Top -> Top ✓. So Heq2 holds via andR, giving Heq2b : B2 <: B2.

So this counterexample lands in subcase Heq1b + Heq2b, not Heq1a + Heq2a.

What about Heq1a + Heq2a? B1 <: B1 direct and B1 <: B2 cross. Concrete instance: B1 = Top -> Top, B2 = X -> Top. Then:

Heq1: (Top -> Top) & (X -> Top) <: Top -> Top \ S ∪ {X}. ssub_d_andL: Top -> Top <: Top -> Top ✓. So Heq1a.
Heq2: (Top -> Top) & (X -> Top) <: X -> Top \ S ∪ {X}. ssub_d_andL: Top -> Top <: X -> Top ✓ (via Top absorption contravariantly). So Heq2a (B1 <: B2).

So Heq1a + Heq2a does happen. In this case, B2 = X -> Top does not have self-eq in \ S ∪ {X}. We can’t derive it.
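The two concrete instances above can be checked mechanically with a small Python model of the avoidance-set relation. This is a sketch under stated assumptions, not your ssub_disjoint: it covers only Top, variables, arrows and intersections, it assumes the usual algorithmic rules without distributivity, and it blocks variable reflexivity for names in the avoidance set. The class names and the function `sub` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Top:
    pass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


def sub(a, b, avoid):
    """Decide 'a <: b avoiding avoid' under the assumed rules."""
    if isinstance(b, Top):
        return True                      # anything is below Top
    if isinstance(b, And):               # right intersection: need both parts
        return sub(a, b.left, avoid) and sub(a, b.right, avoid)
    if isinstance(a, And):               # left intersection: pick one part
        return sub(a.left, b, avoid) or sub(a.right, b, avoid)
    if isinstance(a, Var) and isinstance(b, Var):
        # variable reflexivity is blocked for avoided names
        return a.name == b.name and a.name not in avoid
    if isinstance(a, Arrow) and isinstance(b, Arrow):
        # contravariant domain, covariant codomain
        return sub(b.dom, a.dom, avoid) and sub(a.cod, b.cod, avoid)
    return False


X = Var("X")
S = frozenset({"X"})            # plays the role of S ∪ {X}
x_to_top = Arrow(X, Top())      # X -> Top
top_to_top = Arrow(Top(), Top())  # Top -> Top
both = And(x_to_top, top_to_top)

self_eq_of_intersection = sub(both, both, S)    # True
self_eq_of_x_to_top = sub(x_to_top, x_to_top, S)  # False: X <: X is blocked
absorbed = sub(top_to_top, x_to_top, S)         # True: X <: Top in the domain
other_cross = sub(x_to_top, top_to_top, S)      # False: would need Top <: X
```

In this model the intersection has self-eq while X -> Top does not, which matches both instances: whichever of B1, B2 is X -> Top lacks self-eq, and only the cross-branch fact Top -> Top <: X -> Top is available for it.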

So for this subgoal A <: B2 \ S ∪ {X}, we need a different approach.

The way out: in subcase Heq1a + Heq2a, we have B1 <: B2 \ S ∪ {X} (from Heq2a) and Hsub1 : A <: B1 \ S. Avoidance-set monotonicity does not help directly, since it only weakens \ S ∪ {X} to \ S, whereas we need to lift from \ S to \ S ∪ {X}.

Different chain: derive A <: B2 \ S ∪ {X} via:

A <: B1 \ S ∪ {X} (the companion subgoal, derivable via IH (2) using the Heq1a self-eq).
B1 <: B2 \ S ∪ {X} (from Heq2a).
By transitivity in \ S ∪ {X}: A <: B2 \ S ∪ {X}. ✓

So the second subgoal (after apply ssub_d_and) is derived not by IH directly but by transitivity from the first subgoal (already established) and Heq2a.

This means the order of subgoals matters: handle B1 (using Heq1a for self-eq) first, then use that result + Heq2a to get the second.

Symmetrically, in subcase Heq1b + Heq2b, B2's self-eq is direct (from Heq2b), so handle B2 first, then use transitivity with Heq1b to get the first subgoal.

Refactoring suggestion

Maybe don’t apply ssub_d_and immediately; instead, derive the easier subgoal first and use transitivity for the other:

- (* A <: B1 & B2 *)
  forwards (Heq1 & Heq2): ssub_andr_inv Heq.
  forwards [Heq1a|Heq1b]: ssub_andl_inv Heq1; simpl in *; auto.
  { admit. }
  + (* Heq1a: B1 <: B1 \ S ∪ {X} directly *)
    forwards (IH1 & IH2): IHk X Hsub1. { lia. }
    forwards Hgoal1: IH2 Heq1a. (* A <: B1 \ S ∪ {X} *)
    forwards [Heq2a|Heq2b]: ssub_andl_inv Heq2; simpl in *; auto.
    { admit. }
    * (* Heq2a: B1 <: B2 \ S ∪ {X}, transitivity *)
      forwards Hgoal2: ssub_disjoint_trans Hgoal1 Heq2a. { ... }
      apply ssub_d_and; assumption.
    * (* Heq2b: B2 <: B2 \ S ∪ {X} directly *)
      forwards (IH1' & IH2'): IHk X Hsub2. { lia. }
      forwards Hgoal2: IH2' Heq2b.
      apply ssub_d_and; assumption.
  + (* Heq1b: B2 <: B1 \ S ∪ {X} cross *)
    forwards [Heq2a|Heq2b]: ssub_andl_inv Heq2; simpl in *; auto.
    { admit. }
    * (* Heq2a: B1 <: B2 \ S ∪ {X} bidirectional cross *)
      (* B1 self-eq: trans Heq2a Heq1b *)
      forwards Hself1: ssub_disjoint_trans Heq2a Heq1b. { ... }
      forwards (IH1 & IH2): IHk X Hsub1. { lia. }
      forwards Hgoal1: IH2 Hself1.
      (* B2 self-eq: trans Heq1b Heq2a *)
      forwards Hself2: ssub_disjoint_trans Heq1b Heq2a. { ... }
      forwards (IH1' & IH2'): IHk X Hsub2. { lia. }
      forwards Hgoal2: IH2' Hself2.
      apply ssub_d_and; assumption.
    * (* Heq2b: B2 <: B2 \ S ∪ {X} directly *)
      forwards (IH1' & IH2'): IHk X Hsub2. { lia. }
      forwards Hgoal2: IH2' Heq2b. (* A <: B2 \ S ∪ {X} *)
      (* For Hgoal1: A <: B2 <: B1 by transitivity *)
      forwards Hgoal1: ssub_disjoint_trans Hgoal2 Heq1b. { ... }
      apply ssub_d_and; assumption.

Summary of the four subcases

| Subcase | `B1` self-eq | `B2` self-eq | Strategy |
| --- | --- | --- | --- |
| Heq1a + Heq2a | direct (`Heq1a`) | not direct | First subgoal via IH on `B1`; second via transitivity through `Heq2a` |
| Heq1a + Heq2b | direct (`Heq1a`) | direct (`Heq2b`) | Both subgoals via IH directly |
| Heq1b + Heq2a | bidirectional cross | bidirectional cross | Both self-eqs via transitivity around the cycle, then both via IH |
| Heq1b + Heq2b | not direct | direct (`Heq2b`) | First subgoal via transitivity through `Heq1b`; second via IH on `B2` |

So all four subcases close, but each uses a slightly different combination of IH and transitivity. The key insight is that whenever a branch lacks direct self-eq, the cross-branch fact lets us route through the other branch via transitivity.

To answer your specific question

The case you’re stuck on is the bidirectional cross subcase (Heq1b + Heq2a), although the comment in your script says “Heq2b case”.

Looking again at your script:

* (* B2 <: B1 \ S∪{X} *)
  forwards [Heq2a|Heq2b]: ssub_andl_inv Heq2; simpl in *; auto.
  { admit. }
** (* Heq2b case *)

You wrote ** (* Heq2b case *), but the proof state shows the hypothesis Heq2a : B1 <: B2 \ S ∪ {X} (not Heq2b):

Heq1b : B2 <: B1 \ S ∪ {X}
Heq2a : B1 <: B2 \ S ∪ {X}

The proof state has Heq2a and Heq1b, so this is the bidirectional cross subcase (Heq1b + Heq2a in my table above), and the comment “Heq2b case” is probably just mislabeled in the script.

In this subcase, B1's self-eq comes from transitivity of Heq2a and Heq1b:

** (* Heq1b + Heq2a: bidirectional cross *)
  forwards Hself1: ssub_disjoint_trans Heq2a Heq1b. { apply Nat.lt_succ_diag_r. }
  (* Hself1 : B1 <: B1 \ S ∪ {X} *)
  forwards (IH1 & IH2): IHk X Hsub1. { lia. }
  forwards: IH2 Hself1.
  (* new hypothesis: A <: B1 \ S ∪ {X}, which is exactly the goal *)
  assumption.

This should close the current subgoal.

Caveats

I want to flag a few things to verify:

Order of arguments to transitivity: I wrote ssub_disjoint_trans Heq2a Heq1b to chain B1 <: B2 <: B1, giving B1 <: B1. Make sure your ssub_disjoint_trans takes arguments in the right order — some libraries use trans : A <: B -> B <: C -> A <: C, which would mean the first argument is the “left half” of the chain. With Heq2a : B1 <: B2 and Heq1b : B2 <: B1, this composition gives B1 <: B1. ✓
Size constraint on ssub_disjoint_trans: Your invocation uses apply Nat.lt_succ_diag_r. for the size argument. Verify this matches the way transitivity is stated in your library (e.g., whether it’s strong induction or some explicit budget).
The four subcases I tabulated — I want to be honest that I haven’t traced through every case in your script line by line, just sketched the structure. The Heq1a + Heq2a and Heq1b + Heq2b subcases use transitivity across the two ssub_d_and subgoals (deriving one from the other), which is cleaner if you don’t apply ssub_d_and upfront. Your current script does apply it upfront, which forces handling each subgoal independently — this works for Heq1b + Heq2a (current case) but might be awkward for the asymmetric subcases.

Let me know if any of these need clarification.
