I’d appreciate any explanation of the entropy formula, in particular the logarithm part of it.

Why does multiplying a probability by the log of that probability give us a measure of ‘disorder’?

How do we choose the base of the logarithm? I’ve read that we usually use 2 but that it can also be 10. What does the choice depend on?

Why is entropy negative? I mean, there’s a minus sign before the sum.
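
To make the question concrete, here is a minimal Python sketch of the formula as I understand it, assuming it is the Shannon entropy H = -sum(p * log_b(p)) taken over all outcomes. The function name `entropy` and the two example coin distributions are my own, purely for illustration.

```python
import math

def entropy(probs, base=2):
    """Entropy of a discrete distribution, assuming H = -sum(p * log_base(p))."""
    # Outcomes with p == 0 are skipped, i.e. treated as contributing nothing.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

fair_coin = [0.5, 0.5]    # both outcomes equally likely
biased_coin = [0.9, 0.1]  # one outcome much more likely

print(entropy(fair_coin, base=2))    # entropy in bits
print(entropy(biased_coin, base=2))  # lower than the fair coin
print(entropy(fair_coin, base=10))   # same distribution, different log base
```

Running it shows the fair coin scoring higher than the biased one, and the base changing only the scale of the result.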

An oversimplified or intuitive explanation of the formula will do; I’m just trying to understand what it actually means.