Here we prove that the harmonic number $H_n = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}$ satisfies $H_n = \Theta(\log n)$ without touching calculus (the most common proof interprets $H_n$ as a Riemann sum of $\frac{1}{x}$). Seeing as this is probably the most fundamental asymptotic for computer scientists, I was surprised to see this Stack Exchange question didn't mention the calculus-free approach. The proof here pretty much matches my own answer to the Stack Exchange post.

We first consider powers of 2, $n = 2^m$. For these $n$, we can break up $H_n$ into the "chunks"

$$\left(1\right) + \left(\frac{1}{2} + \frac{1}{3}\right) + \left(\frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7}\right) + \dots + \left(\frac{1}{2^{m-1}} + \dots + \frac{1}{2^m - 1}\right),$$

where chunk $k$ (for $k = 0, 1, \dots, m-1$) collects the terms $\frac{1}{2^k}, \dots, \frac{1}{2^{k+1}-1}$,
and we have one extra term $\frac{1}{2^m} = \frac{1}{n}$. There are $m = \log_2 n$ chunks, and of course $H_n$ is the sum of these chunks (plus the extra term), hence to show $H_n = O(\log n)$ we show each chunk is $O(1)$. Inside each chunk we bound each term above by the reciprocal of the power of 2 at its left end:

$$\frac{1}{2^k} + \frac{1}{2^k + 1} + \dots + \frac{1}{2^{k+1} - 1} \;\le\; \underbrace{\frac{1}{2^k} + \frac{1}{2^k} + \dots + \frac{1}{2^k}}_{2^k \text{ terms}} \;=\; 2^k \cdot \frac{1}{2^k} \;=\; 1.$$
Thus each chunk is at most 1, and taken together with the extra term $\frac{1}{2^m} \le 1$, we have $H_n \le m + 1 = \log_2 n + 1 = O(\log n)$.
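As a sanity check on the upper bound, here is a short Python sketch using exact rational arithmetic; the helper name `chunk_sums` is my own, not from the proof:

```python
from fractions import Fraction

def chunk_sums(m):
    """Exact sums of the chunks 1/2^k + ... + 1/(2^(k+1) - 1), for k = 0..m-1."""
    return [sum(Fraction(1, j) for j in range(2**k, 2**(k + 1)))
            for k in range(m)]

m = 10                                 # n = 2^m = 1024
chunks = chunk_sums(m)
H_n = sum(chunks) + Fraction(1, 2**m)  # the m chunks plus the extra term 1/2^m

assert all(c <= 1 for c in chunks)     # each chunk is at most 1
assert H_n <= m + 1                    # hence H_n <= log2(n) + 1
```

Using `Fraction` avoids any floating-point doubt about whether a chunk sum creeps above 1.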

On the other hand, if we instead *lower bound* the elements by the reciprocal of the next power of 2,

$$\frac{1}{2^k} + \frac{1}{2^k + 1} + \dots + \frac{1}{2^{k+1} - 1} \;\ge\; \underbrace{\frac{1}{2^{k+1}} + \frac{1}{2^{k+1}} + \dots + \frac{1}{2^{k+1}}}_{2^k \text{ terms}} \;=\; 2^k \cdot \frac{1}{2^{k+1}} \;=\; \frac{1}{2},$$

every chunk is at least $\frac{1}{2}$, hence $H_n \ge \frac{m}{2} = \frac{\log_2 n}{2} = \Omega(\log n)$.
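The lower bound can be checked the same way (again a sketch of my own, with exact rationals):

```python
from fractions import Fraction

# Each chunk 1/2^k + ... + 1/(2^(k+1) - 1) has 2^k terms,
# each term at least 1/2^(k+1), so each chunk is at least 1/2.
m = 10
chunks = [sum(Fraction(1, j) for j in range(2**k, 2**(k + 1)))
          for k in range(m)]

assert all(c >= Fraction(1, 2) for c in chunks)  # every chunk is at least 1/2
assert sum(chunks) >= Fraction(m, 2)             # hence H_n >= m/2 = (log2 n)/2
```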

This essentially is the proof. One technicality: we've only shown $H_n = \Theta(\log n)$ for $n$ a power of 2. Monotonicity of $H_n$ completes the proof for all $n$: taking the nearest powers of 2 above and below $n$, call them $n_+$ and $n_-$ (so $n_- \le n \le n_+$, with $n_- \ge n/2$ and $n_+ \le 2n$),

$$H_{n_-} \le H_n \le H_{n_+}.$$
Applying our bounds at the powers of 2 $n_-$ and $n_+$,

$$\Omega(\log n) = \frac{\log_2 (n/2)}{2} \le \frac{\log_2 n_-}{2} \le H_n \le \log_2 n_+ + 1 \le \log_2 (2n) + 1 = O(\log n).$$

Thus $H_n = \Theta(\log n)$.
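As a final check, the argument yields the explicit constants $\frac{\log_2 n - 1}{2} \le H_n \le \log_2 n + 2$ for every $n \ge 1$, which a quick loop (my own sketch) confirms over a range of $n$:

```python
import math

def H(n):
    """Harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1 / k for k in range(1, n + 1))

# Explicit constants falling out of the proof, valid for all n >= 1,
# not just powers of 2.
for n in range(1, 2049):
    assert (math.log2(n) - 1) / 2 <= H(n) <= math.log2(n) + 2
```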

(In general, for a monotonic function with a linear or sublinear target bound, it suffices to establish the bound on only the powers of 2, since such a bound changes by at most a constant factor between $n$ and $2n$.)