Can someone help me with this proof about the Uplift Modeling General Metric?

I am currently trying to implement this paper: Reinforcement Learning for Uplift Modeling.
I have skimmed through the paper and have an intuitive idea of the process they are describing,
but I am struggling with Section 2.2, Uplift Modeling General Metric. Could someone have a look at it and help me understand the thought process?
In particular, I am struggling to understand Lemma 1 and would greatly appreciate some help there.
I just want to understand the maths behind the proof in detail:


Break down each term and step and point out where exactly you are stuck or confused.

If you are unsure about any or all of it, you might have to go through some of the basics of RL and of probability and statistics.

Do note that there is no guarantee you’ll get the exact help you need in these forums, since Dataquest doesn’t focus on RL.

Hey, thanks for pointing that out.

I am unable to understand the expansion in the estimate calculation. I have read about inverse propensity scoring and importance sampling,
and I understand basic probability and RL. It is just a bit confusing how they have proven the two estimates to be equal. I asked a teammate, and her reply was that the math is wrong.

So I was wondering if someone with better math skills could comment on something I might have missed?
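
For reference, here is a minimal numerical sketch of the generic inverse propensity scoring idea as I understand it. It is not the paper's Lemma 1 or its metric. The data-generating process, the policy, and the helper name `ips_value` are all made up for illustration.

```python
import numpy as np

def ips_value(y, t, pi_x, propensity):
    """Generic IPS estimate of a policy's value from logged data.

    y: observed outcomes, t: logged treatments, pi_x: treatment the
    policy would choose, propensity: P(T = t_i | x_i) under logging.
    """
    match = (t == pi_x).astype(float)  # keep samples where the log agrees with the policy
    return float(np.mean(y * match / propensity))

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical setup: binary context, binary treatment, binary outcome.
x = rng.integers(0, 2, size=n)
p_treat = np.where(x == 1, 0.7, 0.3)            # logging policy P(T=1 | X)
t = (rng.random(n) < p_treat).astype(int)

# Assumed outcome model: P(Y=1 | X, T).
mean_y = 0.2 + 0.3 * x * t + 0.1 * (1 - x) * (1 - t)
y = (rng.random(n) < mean_y).astype(float)

pi_x = x.copy()                                  # policy under evaluation: treat iff x == 1
propensity = np.where(t == 1, p_treat, 1 - p_treat)

estimate = ips_value(y, t, pi_x, propensity)

# Ground truth for this toy model: E_X[ E[Y | X, T = pi(X)] ]
# x=1 -> P(Y=1 | x=1, t=1) = 0.5, x=0 -> P(Y=1 | x=0, t=0) = 0.3, P(x=1) = 0.5
true_value = 0.5 * 0.5 + 0.5 * 0.3

print(f"IPS estimate: {estimate:.3f}, true value: {true_value:.3f}")
```

With enough samples the IPS estimate lands close to the true policy value, even though the logged data was collected under a different (random) policy.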

Well, this is currently outside of my complete understanding. This does come under some of the topics I am self-studying these days. So, it’s going to be quite a bit of time before I can confirm on this.

But based on a quick look around, I think it essentially breaks down into:

  • Conditional Expectations
  • Bayes’ Rule for Random Variables

I think based on the above two, you might be able to go through that proof.
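
As a rough illustration of how a conditional-expectation argument typically goes for a generic inverse-propensity estimate (this is not the paper's exact Lemma 1, which I haven't verified): the symbols below are my own assumptions, with X the context, T the logged treatment, Y the outcome, p(t | x) the logging propensity, and π a deterministic policy with p(π(x) | x) > 0.

```latex
\begin{align}
\mathbb{E}\left[\frac{Y \,\mathbb{1}[T = \pi(X)]}{p(T \mid X)}\right]
  &= \mathbb{E}_X\left[\sum_t p(t \mid X)\,
     \mathbb{E}[Y \mid X, T = t]\,
     \frac{\mathbb{1}[t = \pi(X)]}{p(t \mid X)}\right] \\
  &= \mathbb{E}_X\big[\mathbb{E}[Y \mid X, T = \pi(X)]\big]
\end{align}
```

The first line conditions on X and then on T (the tower rule). In the second line the propensity cancels and the indicator keeps only the term with t = π(X).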

That’s the best I can offer at this point and I might be wrong. That’s currently outside my scope, as I mentioned above.

I hope he doesn’t mind (and I might be wrong again), but @Bruno might be someone who is interested in this because of the math of it, so I am tagging him here to see if he could share some insight. But I would request you (@architrajendrakhare) to not bother him directly/indirectly about this if he doesn’t respond.

I know you have posted a similar query in some other forums/communities as well, so hopefully you get a better response there.