
This is a reasonable question, but I think you've made some wrong assumptions.

Typically, the CDN nodes will be within your ISP's network, often on the way to the PoP (or at the PoP). You're unlikely to see a significantly different round trip time if your request is proxied through the CDN versus going to the remote datacenter directly. But even if your total round trip time is a little longer (10 ms in your hypothetical), you get benefits from local TCP and/or TLS termination.
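
As a rough sketch of what local termination buys on connection setup alone (assuming one round trip for the TCP handshake and one more for a TLS 1.3 handshake, with the hypothetical latencies above):

    # Connection setup: 1 RTT for the TCP handshake + 1 RTT for TLS 1.3.
    # (TLS 1.2 without resumption would need 2 RTTs for the TLS part.)
    def setup_ms(rtt_ms, tls_rtts=1):
        return (1 + tls_rtts) * rtt_ms

    print(setup_ms(100))  # 200 ms handshaking with the far-away datacenter
    print(setup_ms(10))   # 20 ms handshaking with a nearby CDN node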

At ~100 ms latency, your effective bandwidth is usually limited more by round trips than by actual connection bandwidth, because of congestion control / slow start. A 1 MB image is going to be somewhere around 700 packets with a ~1500-byte MTU. Assuming standard congestion control, the initial congestion window is 10 packets, and for each packet acked the server can send two more. Assuming you, the server, and the path between have infinite bandwidth and a large enough receive window, the client would receive 10 packets at t=100 ms after the initial request, then 20 additional (30 total) at t=200 ms, ... 320 additional (630 total) at t=600 ms, and the remainder at t=700 ms. If your effective bandwidth is less than about 40 Mbps, you're going to hit congestion in the 6th bunch of packets, but at any connection speed above that, your 1 MB transfer is going to take 700 ms.

If you've got a much smaller MTU, you might need more round trips; and if you've got a larger MTU, you could end up with fewer round trips, but word on the street is inter-AS jumbo frames are rare, and Path MTU Discovery isn't great, so lots of servers force an effective max MTU of 1500 or less, because it's easier to send smaller packets to everyone than to fix path MTU issues.
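
Here's a minimal sketch of that arithmetic, assuming an idealized model (1460-byte payload per packet, initial congestion window of 10, window doubling every round trip, no loss, unlimited bandwidth and receive window):

    # Idealized slow start: initial cwnd of 10 packets, cwnd doubles each
    # round trip, no loss, bandwidth and receive window never the bottleneck.
    def rtts_to_transfer(total_bytes, mss=1460, init_cwnd=10):
        packets = -(-total_bytes // mss)  # ceil: ~685 packets for 1 MB
        sent, cwnd, rtts = 0, init_cwnd, 0
        while sent < packets:
            sent += cwnd  # one flight of packets per round trip
            cwnd *= 2     # exponential growth during slow start
            rtts += 1
        return rtts

    rtt_ms = 100
    print(rtts_to_transfer(1_000_000))           # 7 round trips
    print(rtts_to_transfer(1_000_000) * rtt_ms)  # 700 ms to last byte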

But if we take your steps of 10 ms round trip to the CDN, 10 ms from the CDN to the PoP, and 90 ms from the PoP to the datacenter, and assume the transit times on each hop are symmetric, we get

Your request starts at t=0, the CDN-to-PoP request starts at t=5 ms, the PoP-to-DC request starts at t=10 ms, then the PoP-to-DC transfer takes 7 round trips = 630 ms for all the data to reach the PoP at t=640 ms, and you get it 10 ms later at t=650 ms. Your first byte of response is delayed by 10 ms, but it's worthwhile because time to last byte decreases by 50 ms. If the transit times between hops are asymmetric, the times at which the client gets data don't change, but the math makes my head spin more.
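
The same model split across the hops, as a sketch (the slow-start transfer only spans the PoP <-> DC leg; the function name, parameters, and one-way delays are just the hypothetical numbers above):

    # The congestion-controlled transfer runs only over the PoP<->DC leg;
    # the client<->CDN<->PoP legs just add fixed one-way delay each way.
    def last_byte_ms(one_way_to_pop_ms, pop_dc_rtt_ms,
                     total_bytes=1_000_000, mss=1460, init_cwnd=10):
        packets = -(-total_bytes // mss)
        sent, cwnd, rtts = 0, init_cwnd, 0
        while sent < packets:
            sent += cwnd
            cwnd *= 2
            rtts += 1
        # request reaches the PoP, data streams DC->PoP, last packet is
        # relayed on to the client
        return one_way_to_pop_ms + rtts * pop_dc_rtt_ms + one_way_to_pop_ms

    print(last_byte_ms(10, 90))   # 650 ms via the PoP
    print(last_byte_ms(0, 100))   # 700 ms going straight to the DC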

If you change it up, and say you're 50 ms round trip to the CDN, which is 50 ms round trip to the PoP, which is 50 ms round trip to the DC, first byte jumps to t=150 ms, but time to last byte would be 7 * 50 ms + 100 ms = 450 ms, versus 7 * 150 ms = 1,050 ms going straight to the DC.
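
Plugging the 50/50/50 split into the sketch above (one-way to the PoP is 25 ms + 25 ms = 50 ms):

    print(last_byte_ms(50, 50))   # 450 ms via the PoP
    print(last_byte_ms(0, 150))   # 1050 ms going straight to the DC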

And, if the PoP -> DC connection happens over a warm socket with an appropriate congestion window and receive window, the transfer from the DC to the PoP can happen much faster. You could, of course, keep a warm connection from the client to the datacenter as well, but most services don't want to hold millions or billions of client connections open to all their datacenters, so running clients through PoPs or CDNs can be pretty handy.
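
Reusing the sketch above with a hypothetical already-warm PoP -> DC connection whose congestion window covers the whole object:

    # Warm PoP->DC socket: congestion window already large enough that the
    # whole 1 MB fits in one flight, so one 90 ms round trip moves it all.
    print(last_byte_ms(10, 90, init_cwnd=700))  # 110 ms to last byte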

There's some handwaving here (I ignored processing time at each hop, but it's usually pretty low), but it really is helpful to handle congestion control closer to the user. Any lost packets can be resent sooner for much quicker recovery if managed locally as well. And for things that are cacheable, a read-through caching CDN -> PoP -> datacenter approach makes a lot of sense to reduce demand on the datacenter, and it benefits from likely locality of reference --- people in the same area / on the same ISP are likely to fetch images that others in their area have fetched.
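
And a toy read-through cache in the same spirit (the class, the fetch function, and the 300-second TTL are made up for illustration, not any particular CDN's API):

    import time

    # Toy read-through cache: serve from the local tier if present and fresh,
    # otherwise fetch from the next tier up and remember the answer.
    class ReadThroughCache:
        def __init__(self, fetch_upstream, ttl_s=300):
            self.fetch_upstream = fetch_upstream
            self.ttl_s = ttl_s
            self.store = {}  # key -> (expires_at, value)

        def get(self, key):
            hit = self.store.get(key)
            if hit and hit[0] > time.monotonic():
                return hit[1]                 # cache hit: no trip upstream
            value = self.fetch_upstream(key)  # miss: go to the next tier
            self.store[key] = (time.monotonic() + self.ttl_s, value)
            return value

    # Chained tiers: CDN edge reads through the PoP, which reads through the DC.
    dc = lambda key: b"image-bytes-for-" + key.encode()
    pop = ReadThroughCache(dc)
    edge = ReadThroughCache(pop.get)
    edge.get("/img/cat.jpg")  # first fetch walks edge -> PoP -> DC
    edge.get("/img/cat.jpg")  # neighbors on the same ISP get the edge copy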


