A modern CPU operates at a clock speed of 2.5GHz. That is roughly 524 times faster than the 4.77MHz clock of the original 8086-class processors released in the late 1970s. Today's processors can still run the programs written for those old machines, yet they actually compute data far faster than 524 times the speed of those old CPUs.
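The clock-speed ratio itself is simple arithmetic; a quick check (the two frequencies are the ones quoted above) looks like this:

```python
# Clock-speed ratio between a modern 2.5GHz CPU and a 4.77MHz 8086-class chip.
modern_hz = 2.5e9   # 2.5 GHz
old_hz = 4.77e6     # 4.77 MHz

ratio = modern_hz / old_hz
print(f"{ratio:.2f}x")  # about 524.11x
```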
The reason is that modern CPUs use a different internal design. They can still execute the same programs, but the way those programs are executed has changed. The difference lies entirely inside the chip: nothing outside has changed, and copies of those old programs haven't changed; it's the internal workings of the CPU that make the difference.
To process data, a processor takes the workload, breaks it into smaller stages, and divides it into smaller workable units. This breakdown also creates an environment where more than one thing can be accomplished at a time: instructions from the program arrive in the order A, B, C, D, but might well be executed by the core in the order D, B, A, C. In reality, A is broken into stages such as A.1, A.2, A.3, A.4, and so on. It may be that D.1, D.2, and D.3 execute before B.1 ever does, and that A.1 executes even earlier, but A.2 cannot complete until all of D and B are finished, because A.2 depends on their results, and so on.
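As a rough illustration (not how real hardware is built), the dependency rule above can be sketched as a tiny scheduler that runs any stage whose prerequisites are already done. The stage names and the dependency table are invented to match the A/B/D example:

```python
# Toy out-of-order scheduler: a stage runs as soon as everything it
# depends on has finished, regardless of program order.
deps = {
    "A.1": [],                           # A.1 needs nothing, can go first
    "D.1": [],
    "D.2": ["D.1"],
    "D.3": ["D.2"],
    "B.1": ["D.3"],                      # assume B.1 waits on D's result
    "A.2": ["A.1", "B.1", "D.3"],        # A.2 needs results of B and D
}

done, order = set(), []
while len(done) < len(deps):
    for stage, needs in deps.items():
        if stage not in done and all(n in done for n in needs):
            done.add(stage)
            order.append(stage)
            break  # run one stage per "cycle"

print(order)  # ['A.1', 'D.1', 'D.2', 'D.3', 'B.1', 'A.2']
```

Note that A.2 is forced to the very end even though A came first in program order, exactly as described above.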
It's the ability to do all of this that makes an AMD-MHz different from an Intel-MHz. The number of things that can be done out of order, the number of dot-units required to do it all, and the speed at which each dot-unit can be processed or executed all add up to the different performance we see at a given clock speed.
Modern processors differ in how many stages they break workloads into, and thus in how many things get done at a time. It works like this: the smaller the workload per dot-unit, the faster that unit can be processed, because it's doing less actual work per clock.
Think of it like carrying water from a faucet to a small pond. In one instance you have 12 buckets that each hold a gallon of water. If you fill the buckets, move the buckets, and empty the buckets, you've moved 12 gallons of water in one trip. If the pond is 120 gallons, it will take 10 trips.
For the Intel chip, you have either 20 or 31 buckets, but each is smaller, and you'll be moving 20 or 31 buckets each trip. The result is that you have to fill more buckets and do more work each trip to get the same workload accomplished: it takes more effort to move a bucket into place in front of the faucet, let it fill, move it out of the way, grab another, and so on for 20 or 31 small buckets than it does to do the same with 12 larger ones. It still takes 10 trips to fill the pond, so you're doing more work each trip. But because each bucket holds less water, it fills more quickly than a gallon-sized bucket would. It comes down to a design tradeoff.
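The bucket analogy can be put in rough numbers. This is a back-of-the-envelope sketch, not a measurement of real chips: assume the same fixed amount of work per "trip" and split it into 12, 20, or 31 stages; more stages means less work per stage, which means each stage finishes faster and the clock can tick faster:

```python
# Toy pipeline-depth tradeoff: the same total work split into more,
# smaller stages. All numbers are illustrative, not real-chip data.
total_work_ns = 12.0  # hypothetical time to finish one "trip" of work

for stages in (12, 20, 31):
    stage_time = total_work_ns / stages  # smaller bucket = less work per stage
    clock_ghz = 1.0 / stage_time         # shorter stage = higher possible clock
    print(f"{stages:2d} stages: {stage_time:.3f} ns/stage, ~{clock_ghz:.2f} GHz")
```

The total work per trip never changes; only how finely it is sliced, and therefore how fast the clock can run, does.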