Hardware Forum / CPU / AMD 64 bit / October 2008
X2 vs X4
Dave - 18 Sep 2008 19:48 GMT
If an application does NOT support multiple CPU cores, will it run slower on a Phenom 2.4 GHz CPU than it would on an X2 2.4 GHz CPU?
I currently have a 2.2 GHz X2 and I want to upgrade it. My motherboard supports the Phenom X4 but from what I'm reading, software that doesn't support multiple cores may run slower if I do.
Any advice?
Bill - 18 Sep 2008 22:18 GMT
> If an application does NOT support multiple CPU cores, will it run slower on a Phenom 2.4 GHz CPU than it would on an X2 2.4 GHz CPU?
> [quoted text clipped - 4 lines]
> Any advice?
Why would you think that software that runs on a 2.2GHz multiprocessor cpu would run slower on a multiprocessor cpu that's 200MHz faster?
Bill
 Signature GMail & Google Goobers. This century's answer to AOL and WebTV.
Zootal - 18 Sep 2008 23:54 GMT
Multithreading cpus can make some software slow down. Multi-core cpus will not unless the speed of a core itself is slower. Then, the slowdown is caused by the slower core, not by the fact that it's a multi core cpu.
OTOH....let's stop and think a bit. If I have a multi-core cpu with non-shared caches, then I now have cache coherency issues to deal with if the cpu scheduler for some reason moves my task to a different core. Any cache lines I try to access that aren't in the current cache will have to be copied from the cache it resides in, or from memory if it's no longer in any cache. So maybe the answer to the question of performance is "it depends"?
Where did you read that software that doesn't support multiple cores may run slower?
> If an application does NOT support multiple CPU cores, will it run slower on a Phenom 2.4 GHz CPU than it would on an X2 2.4 GHz CPU?
> [quoted text clipped - 4 lines]
> Any advice?
Dave - 19 Sep 2008 13:16 GMT
> Where did you read that software that doesn't support multiple cores may run slower?
A thread on CraigsList a few days ago. Several people were discussing performance issues and stated that software that does not support multiple cores runs slower on a multi-core CPU than on a non-multi-core CPU. Nobody disagreed with that statement in the thread.
Scott Lurndal - 19 Sep 2008 18:13 GMT
>> Where did you read that software that doesn't support multiple cores may run slower?
> [quoted text clipped - 3 lines]
> multiple cores runs slower on a multi-core CPU than on a non-multi-core CPU. Nobody disagreed with that statement in the thread.
I'd not consider craigslist to be a top technical forum.
Given identical clock speeds and voltages, a single-threaded application will perform equally on a single core or a multi-core box. The multi-core box will, of course, be able to run multiple copies of the single-threaded application much faster than the single core box.
When one considers that a typical operating system often has dozens of processes running other than the "foreground application", an application on a multicore system _may_ perform better, because the operating system processes can run on the other core freeing capacity for the application. Now, this really only holds if the application is using 80% or more of the processor (e.g. mp3 encoders, video transcoders, numerical analysis applications, etc). Most graphical applications seldom use significant amounts of processing power.
scott
Bill - 20 Sep 2008 00:36 GMT
>> Where did you read that software that doesn't support multiple cores may run slower?
> [quoted text clipped - 3 lines]
> multiple cores runs slower on a multi-core CPU than on a non-multi-core CPU. Nobody disagreed with that statement in the thread.
Scott is correct. And the last place I'd go to for technical information is Craigs list.
Try reading some of the papers available at AMD's web site instead of relying on fourth-hand hearsay. And yes, I include anything I say as well. You have no idea whether what's given out on Usenet is the 'real deal' unless you have some background knowledge of what's going on or have seen that the poster you're reading has a history of giving good/correct information. A guy named Paul comes to mind.
http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118,00.html
Bill
 Signature GMail & Google Goobers. This century's answer to AOL and WebTV.
Ed Light - 19 Sep 2008 01:14 GMT
One core of a Phenom will run a little faster than one core of an X2, given the same clock speed.
 Signature Ed Light
Better World News TV Channel: http://realnews.com
Bring the Troops Home: http://bringthemhomenow.org http://antiwar.com
Iraq Veterans Against the War: http://ivaw.org http://couragetoresist.org
Send spam to the FTC at spam@uce.gov Thanks, robots.
Dave - 19 Sep 2008 13:17 GMT
> One core of a Phenom will run a little faster than one core of an X2, given the same clock speed.
Thank you SOO much. That is great to know.
--Dave
Dave Feustel - 19 Sep 2008 14:24 GMT
The effective clock speed of a single core in a multiple core chip is the chip clock speed divided by the number of cores, so an application running on a single core in a multicore chip will run slower than the same app running on a single core cpu. BUT in a multicore cpu the app will experience fewer task switches for interrupts, etc. because there are other cores to run the interrupts, etc. on. Since each core has its own set of registers, less time is spent saving and restoring register data, of which there is a lot on X64 cores. So whether a single-threaded app runs faster or slower on a multicore chip is a little hard to predict a priori.
Scott Lurndal - 19 Sep 2008 18:16 GMT
> The effective clock speed of a single core in a multiple core chip is the chip clock speed divided by the number of cores,
This is incorrect.
All cores run at the same clock speed, which is the 'chip clock speed'. Of course the power-management capabilities of the processor allow the operating system to individually ramp-down the voltages and frequencies of each core to allow them to run slower (when idle), but the norm is for all cores to run at the same clock speed which is equal to (not a fraction of) the core clock speed.
So called SMT (aka Hyperthreading) is different, in that the secondary thread is leveraging otherwise idle execution and load/store resources on a single core.
scott
Zootal - 19 Sep 2008 19:39 GMT
> So called SMT (aka Hyperthreading) is different, in that the secondary thread is leveraging otherwise idle execution and load/store resources on a single core.
>
> scott
I don't get this - what can hyperthreading do that a good cpu scheduler can't do? If I have two virtual cores, I have to have two schedulers running (one for each virtual cpu), each with their own set of queues and each with 50% cpu time. Is that more efficient than one single scheduler that has 100% cpu time?
Scott Lurndal - 19 Sep 2008 23:41 GMT
>> So called SMT (aka Hyperthreading) is different, in that the secondary thread is
> [quoted text clipped - 5 lines]
> I don't get this - what can hyperthreading do that a good cpu scheduler can't do?
Leverage otherwise idle resources in the core. A core typically has two or more integer ALUs and one or more floating point units. These allow superscalar behaviour (i.e. multiple instructions can be in flight at the same time (multiple issue)). However, for many instruction streams, not all of the ALUs and FPUs are used, so a second 'logical' processor (the hyperthread) can be made available to the operating system to take advantage of those idle resources.
Note that even with HT/SMT, the operating system sees them as two distinct cores, even though they aren't really stand-alone cores.
A four physical core processor with SMT will appear to the operating system as 8 logical cores.
> If I have two virtual cores, I have to have two schedulers running (one for each virtual cpu), each with their own set of queues and each with 50% cpu time. Is that more efficient than one single scheduler that has 100% cpu time?
There is only one scheduler in a typical operating system. It schedules across all logical cores and is typically NUMA and SMT aware in order to make optimal scheduling decisions. NUMA awareness means scheduling user threads/tasks on a CPU close to memory. SMT aware schedulers understand that resources are shared and attempt to schedule related threads (i.e. threads from the same process/job/task) on the secondary threads.
scott
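The point above, that the operating system counts each SMT thread as its own logical CPU, can be checked from user space. This is a minimal sketch using only the Python standard library; the helper names `logical_cpu_count` and `usable_cpu_count` are made up for illustration, and the numbers reported depend entirely on the machine and OS.

```python
import os

def logical_cpu_count() -> int:
    """Logical CPUs reported by the OS; SMT threads are counted separately."""
    return os.cpu_count() or 1

def usable_cpu_count() -> int:
    """Logical CPUs this process may run on (affinity mask where available)."""
    if hasattr(os, "sched_getaffinity"):  # not available on every platform
        return len(os.sched_getaffinity(0))
    return logical_cpu_count()

if __name__ == "__main__":
    print("logical CPUs seen by the OS:", logical_cpu_count())
    print("logical CPUs usable by this process:", usable_cpu_count())
```

On a four-core part with SMT enabled the first figure would be expected to read 8, but the standard library does not distinguish physical cores from SMT siblings, so that breakdown has to come from other tools.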
DevilsPGD - 20 Sep 2008 00:25 GMT
> I don't get this - what can hyperthreading do that a good cpu scheduler can't do? If I have two virtual cores, I have to have two schedulers running (one for each virtual cpu), each with their own set of queues and each with 50% cpu time. Is that more efficient than one single scheduler that has 100% cpu time?
The problem that Hyperthreading was designed to solve is that the P4 series has an extremely long pipeline.
In other words, it takes many cycles to get instructions to the CPU, and for the CPU to send instructions to pull data to/from memory or other hardware components.
Hyperthreading was designed to help/encourage existing OSes to schedule multiple threads/workloads so that the CPU can run them, from the OS's point of view, concurrently, rather than waiting for one workload to finish before sending another.
Zootal - 20 Sep 2008 01:40 GMT
> In other words, it takes many cycles to get instructions to the CPU, and for the CPU to send instructions to pull data to/from memory or other hardware components.
That isn't exactly correct - the long pipeline *is* the cpu, it just takes a lot of cycles to make it through the pipeline. In order to get the advertised clock speed, they had to make the pipeline longer. The P4 Prescott 3.8GHz has a 31 cycle pipeline.
Bill - 20 Sep 2008 00:40 GMT
> The effective clock speed of a single core in a multiple core chip is the chip clock speed divided by the number of cores,
Have you got a cite for that?
<snip>
Bill
 Signature GMail & Google Goobers. This century's answer to AOL and WebTV.
Dave Feustel - 21 Sep 2008 14:21 GMT
>> The effective clock speed of a single core in a multiple core chip is the chip clock speed divided by the number of cores,
> [quoted text clipped - 4 lines]
>
> Bill
The person who told me this is Miles R***, a person who sells computers for a living. If the cores ran at the chip's nominal clock speed, a four-core chip would perform 4 times faster than a single core chip at the same clock speed, which they don't. And the power consumption would be much higher. So I think Miles is correct.
Miles Bader - 21 Sep 2008 14:35 GMT
> The person who told me this is Miles R***, a person who sells computers for a living. If the cores ran at the chip's nominal clock speed, a four-core chip would perform 4 times faster than a single core chip at the same clock speed, which they don't. And the power consumption would be much higher. So I think Miles is correct.
No, this is not correct.
Either you misinterpreted "Miles R***", or he is quite ignorant about his own product (or both).
-Miles
 Signature Genealogy, n. An account of one's descent from an ancestor who did not particularly care to trace his own.
Rodney Pont - 21 Sep 2008 15:13 GMT
> The person who told me this is Miles R***, a person who sells computers for a living. If the cores ran at the chip's nominal clock speed, a four-core chip would perform 4 times faster than a single core chip at the same clock speed, which they don't. And the power consumption would be much higher. So I think Miles is correct.
The four core chip can only run an application on all four cores if it's threaded and at least 4 threads have work that can be run simultaneously. Even in threaded applications this can't always happen unless the threads are doing something that doesn't depend on others, say converting a video file where each core can be given a section of the file to convert.
I can see how he came to the conclusion though if he ran a single threaded application and it ran four times slower than expected, since it ran on only one core. Get him to run four of them at the same time and they should complete in nearly the same time as one providing he isn't running anything else at that time.
As for power consumption my dual core chip uses 45 watts and the quad core version uses 95 watts. Taking into account the extra circuitry for the 4 cores it's about right.
 Signature Regards - Rodney Pont The from address exists but is mostly dumped, please send any emails to the address below e-mail ngpsm4 (at) infohitsystems (dot) ltd (dot) uk
Jim Beard - 21 Sep 2008 17:45 GMT
>> The person who told me this is Miles R***, a person who sells computers
>> for a living. If the cores ran at the chip's nominal clock speed, a
> [quoted text clipped - 18 lines]
> core version uses 95 watts. Taking into account the extra circuitry for the 4 cores it's about right.
One must also bear in mind that a dual-core or quad-core CPU has to devote some processing time to deciding what to run on which core, when. This more intricate scheduling task routinely results in a process running on only one core running more slowly (all things included) than a process running on a single-core CPU that is lightly loaded.
Whatever the CPU speed is in GHz or MHz, all cores will work at that speed unless power management software readjusts the speed. That does not mean that all that speed is usable, though. You still have delays due to I/O requirements, scheduling delays, wait states, and a host of other bottlenecks, real and potential. My home computer is an AMD 64-bit 5000+ dual-core, and CPU usage typically is in the 1 to 3 percent range when I am not compiling or doing some other CPU-intensive task. This does not mean that all tasks complete instantaneously nor that response time is zero (though it is very nice, I will admit).
Specifically with respect to X2 vs X4, the kernel scheduler will do a fairly good job of using two CPUs, but rarely does well with more than two unless the applications are specifically tailored for multi-CPU usage. Thus, the percentage gain in performance from shifting from single to dual-core cpu is likely to be significantly greater than the percentage gain from shifting from dual-core to quad-core, unless you have software tailored for the additional cores.
The big question, of course, is, are your applications CPU-intensive enough to make use of the available capacity, regardless of number of cores? If the computer is not heavily loaded at least part of the time, the answer is likely to be no.
Cheers!
jim b.
 Signature UNIX is not user unfriendly; it merely expects users to be computer-friendly.
Scott Lurndal - 21 Sep 2008 20:56 GMT
> Specifically with respect to X2 vs X4, the kernel scheduler will do a fairly good job of using two CPUs, but rarely does well with more than two unless the applications are specifically tailored for
maybe with respect to windows, but linux schedulers are O(1) over large numbers of cores.
scheduler overhead is pretty much non-existent.
scott
Zootal - 24 Sep 2008 22:06 GMT
>> Specifically with respect to X2 vs X4, the kernel scheduler will do a
>> fairly good job of using two CPUs, but rarely does well with more
> [quoted text clipped - 6 lines]
>
> scott
Are you sure about that? Each cpu has its own set of runqueues. If I have 4 cpus, I have 4 sets of runqueues to manage, and 4 sets of runqueues to search. The runqueue itself can be searched for the next entry in O(1) time - this is where the O(1) comes from, because the amount of time it takes to find the next task in the queue is constant and not dependent on the number of tasks in the queue.
I would think that the default linux scheduler is O(n) over large numbers of cores, where n = the number of cores.
Scott Lurndal - 24 Sep 2008 23:12 GMT
>>> Specifically with respect to X2 vs X4, the kernel scheduler will do a
>>> fairly good job of using two CPUs, but rarely does well with more
> [quoted text clipped - 16 lines]
> I would think that the default linux scheduler is O(n) over large numbers of cores, where n = the number of cores.
If you have a runqueue per core, then you simply schedule the next entry in the queue for each core. O(1). Remember that code is shared by all processors, and scheduling happens in-context - there is not a scheduler "thread" or "job" or "task" per se.
scott
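The per-core runqueue argument in the last two posts can be made concrete with a toy model. The sketch below is not kernel code: the class name `RunQueue`, the eight priority levels, and the task names are all invented for illustration. It shows one common way to get a constant-time pick, a FIFO per priority level plus a bitmap of non-empty levels, with each CPU consulting only its own queue.

```python
from collections import deque

NUM_PRIORITIES = 8  # illustrative only; real kernels use more levels

class RunQueue:
    """Toy per-CPU run queue: one FIFO per priority plus a bitmap of non-empty levels."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.bitmap = 0  # bit p set means queues[p] holds at least one task

    def enqueue(self, task, priority):
        self.queues[priority].append(task)
        self.bitmap |= 1 << priority

    def pick_next(self):
        """Return the highest-priority task (lowest number), or None if idle."""
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit; the cost does not grow with the task count.
        priority = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queues[priority].popleft()
        if not self.queues[priority]:
            self.bitmap &= ~(1 << priority)
        return task

# One queue per CPU; a CPU looks only at its own queue when it reschedules.
cpus = [RunQueue() for _ in range(4)]
cpus[0].enqueue("encoder", 3)
cpus[0].enqueue("irq-worker", 0)
cpus[1].enqueue("shell", 5)
```

In this model adding cores adds queues, but each core's pick stays constant-time because it never scans the other queues. Work that does span queues, such as periodic load balancing between cores, is deliberately left out of the sketch.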
Bill - 22 Sep 2008 00:17 GMT
> >> The effective clock speed of a single core in a multiple core chip is the chip clock speed divided by the number of cores,
> [quoted text clipped - 10 lines]
> the same clock speed, which they don't. And the power consumption would be much higher. So I think Miles is correct.
You're entitled to your opinion, but as far as "The effective clock speed of a single core in a multiple core chip is the chip clock speed divided by the number of cores" is concerned Miles R*** is full of sh.t*, and you can tell him I said so.
You need to get to Intel's or AMD's website and do some reading.
Bill
 Signature GMail & Google Goobers. This century's answer to AOL and WebTV.
Richard P - 22 Sep 2008 00:28 GMT
>>>> The effective clock speed of a single core in a multiple core chip is the chip clock speed divided by the number of cores,
> [quoted text clipped - 17 lines]
>
> Bill
I have an X4 and each core at default is 2.5 GHz.
DevilsPGD - 23 Sep 2008 07:14 GMT
> The person who told me this is Miles R***, a person who sells computers for a living.
"Never trust someone trying to sell you something" comes to mind.
> If the cores ran at the chip's nominal clock speed, a four-core chip would perform 4 times faster than a single core chip at the same clock speed, which they don't.
Depending on your task, a four-core CPU can perform reasonably close to four times the clock speed of a single core CPU. Unfortunately, few tasks parallelize that well, and even less software takes full advantage of modern CPUs.
That being said, aside from some shady marketing in the past advertising dual CPU systems as double the clock speed of one CPU rather than advertising the actual configuration, each core runs at the full clock speed advertised.
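How far short of 4x a four-core chip falls can be estimated with Amdahl's law, which the thread does not name but which matches the reasoning above: only the part of a task that can run in parallel benefits from extra cores. A small sketch follows; the function name `amdahl_speedup` and the example fractions are chosen purely for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when `parallel_fraction` of the work can be spread over `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

if __name__ == "__main__":
    # Hypothetical workloads: fully serial, half parallel, fully parallel.
    for fraction in (0.0, 0.5, 1.0):
        print(f"{fraction:.0%} parallel on 4 cores: {amdahl_speedup(fraction, 4):.2f}x")
```

A strictly single-threaded program has a parallel fraction of zero, so the formula gives a speedup of 1: no faster on four cores, but no slower either. The model ignores memory bandwidth, cache effects, and scheduling overhead, so real results are usually somewhat worse.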
Dave Feustel - 23 Sep 2008 13:44 GMT
>> The person who told me this is Miles R***, a person who sells computers
>> for a living.
> [quoted text clipped - 14 lines]
> advertising the actual configuration, each core runs at the full clock speed advertised.
So the 4 core chip cpu should run 4 independent identical tasks (compute pi to 1 million digits) in essentially the same time that a single core runs one instance of that task?
Richard P - 23 Sep 2008 18:07 GMT
>>> The person who told me this is Miles R***, a person who sells computers
>>> for a living.
> [quoted text clipped - 16 lines]
> pi to 1 million digits) in essentially the same time that a single core runs one instance of that task?
Yes
DevilsPGD - 23 Sep 2008 20:39 GMT
>> Depending on your task, a four-core CPU can perform reasonably close to
>> four times the clock speed of a single core CPU. Unfortunately, few
> [quoted text clipped - 9 lines]
> pi to 1 million digits) in essentially the same time that a single core runs one instance of that task?
More or less, yes. However, in the real world, not all tasks will scale quite this well as many tasks require not only CPU resources, but also other resources which may become starved before you load all four cores.
For something that can be done entirely on-chip, you'll get four times the performance using all four cores of a quad 2.4GHz CPU compared with a single core version of the same 2.4GHz CPU.
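Anyone who wants to try the "four independent copies" experiment on their own machine could use something like the sketch below. It assumes Python 3 and uses prime counting as a stand-in for the pi calculation; the function names and the workload size are arbitrary choices, not anything from the thread.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit: int) -> int:
    """CPU-bound, single-threaded work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def time_copies(copies: int, limit: int) -> float:
    """Wall-clock seconds to run `copies` independent instances, one process each."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=copies) as pool:
        list(pool.map(count_primes, [limit] * copies))
    return time.perf_counter() - start

if __name__ == "__main__":
    limit = 200_000  # arbitrary workload size, chosen only for illustration
    for copies in (1, 2, 4):
        print(f"{copies} copies: {time_copies(copies, limit):.2f} s")
```

If each core really runs at the full clock speed, the wall-clock time for 4 copies on a quad-core should stay close to the time for 1 copy; if the "clock divided by cores" claim were true, it would be roughly the same but the 1-copy time would itself be about four times longer than on a single-core chip of the same rating. Process start-up cost and other background load will add some noise either way.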