
Hardware Forum / CPU / AMD 64 bit / October 2008


X2 vs X4

Dave - 18 Sep 2008 19:48 GMT
If an application does NOT support multiple CPU cores, will it run
slower on a Phenom 2.4 GHz CPU than it would on an X2 2.4 GHz CPU?

I currently have a 2.2 GHz X2 and I want to upgrade it.  My motherboard
supports the Phenom X4 but from what I'm reading, software that doesn't
support multiple cores may run slower if I do.

Any advice?
Bill - 18 Sep 2008 22:18 GMT
> If an application does NOT support multiple CPU cores, will it run
> slower on a Phenom 2.4 GHz CPU than it would on an X2 2.4 GHz CPU?
[quoted text clipped - 4 lines]
>
> Any advice?

Why would you think that software that runs on a 2.2GHz
multiprocessor cpu would run slower on a multiprocessor cpu that's 200MHz
faster?

        Bill
Signature

GMail & Google Goobers.
This century's answer to AOL and WebTV.

Zootal - 18 Sep 2008 23:54 GMT
Multithreading cpus can make some software slow down. Multi-core cpus will
not unless the speed of a core itself is slower. Then, the slowdown is
caused by the slower core, not by the fact that it's a multi core cpu.

OTOH....let's stop and think a bit. If I have a multi-core cpu with
non-shared caches, then I now have cache coherency issues to deal with if
the cpu scheduler for some reason moves my task to a different core. Any
cache lines I try to access that aren't in the current cache will have to be
copied from the cache it resides in, or from memory if it's no longer in any
cache. So maybe the answer to the question of performance is "it depends"?

Where did you read that software that doesn't support multiple cores may run
slower?

> If an application does NOT support multiple CPU cores, will it run
> slower on a Phenom 2.4 GHz CPU than it would on an X2 2.4 GHz CPU?
[quoted text clipped - 4 lines]
>
> Any advice?
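
Editor's aside on the task-migration concern raised in this post: on Linux, a process can be restricted to one core so the scheduler cannot move it away from a warm cache. The sketch below is only an illustration. It uses Python's `os.sched_getaffinity` and `os.sched_setaffinity`, which exist only on some platforms (notably Linux); the helper name `pin_to_one_cpu` is made up here, and whether pinning helps at all depends on the workload.

```python
import os


def pin_to_one_cpu():
    """Restrict the current process to a single allowed CPU.

    Hypothetical helper for illustration. Returns the CPU number chosen,
    or None when the platform lacks the affinity calls or refuses the request.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. Windows or macOS: no affinity API in the os module
    allowed = os.sched_getaffinity(0)  # 0 means "the calling process"
    cpu = min(allowed)                 # arbitrary choice: lowest allowed CPU
    try:
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return None
    return cpu
```

Calling `pin_to_one_cpu()` at the start of a single-threaded program keeps it on one core for the rest of its run. It does not make the core any faster; it only removes migrations as a variable when comparing timings.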
Dave - 19 Sep 2008 13:16 GMT
> Where did you read that software that doesn't support multiple cores may run
> slower?

A thread on CraigsList a few days ago.  Several people were discussing
performance issues and stated that software that does not support
multiple cores runs slower on a multi-core CPU than on a non-multi-core
CPU.  Nobody disagreed with that statement in the thread.
Scott Lurndal - 19 Sep 2008 18:13 GMT
>> Where did you read that software that doesn't support multiple cores may run
>> slower?
[quoted text clipped - 3 lines]
>multiple cores runs slower on a multi-core CPU than on a non-multi-core
>CPU.  Nobody disagreed with that statement in the thread.

I'd not consider craigslist to be a top technical forum.

Given identical clock speeds and voltages, a single-threaded application
will perform equally on a single core or a multi-core box.   The multi-core
box will, of course, be able to run multiple copies of the single-threaded
application much faster than the single core box.

When one considers that a typical operating system often has dozens of
processes running other than the "foreground application", an application
on a multicore system _may_ perform better, because the operating system processes
can run on the other core freeing capacity for the application.   Now, this
really only holds if the application is using 80% or more of the processor (e.g.
mp3 encoders, video transcoders, numerical analysis applications, etc).  Most
graphical applications seldom use significant amounts of processing power.

scott
Bill - 20 Sep 2008 00:36 GMT
> > Where did you read that software that doesn't support multiple cores may run
> > slower?
[quoted text clipped - 3 lines]
> multiple cores runs slower on a multi-core CPU than on a non-multi-core
> CPU.  Nobody disagreed with that statement in the thread.

Scott is correct. And the last place I'd go to for technical
information is Craigs list.

Try reading some of the papers available at AMD web site instead of
relying on fourth hand hearsay. And yes I include anything I say as
well.  You have no idea that what's given out on Usenet is the 'real
deal' unless you have some background knowledge of what's going on or
have seen that the poster you're reading has a history of giving
good/correct information. A guy named Paul comes to mind.

http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118,00.html

        Bill

Ed Light - 19 Sep 2008 01:14 GMT
One core of a Phenom will run a little faster than one core of an X2,
given the same clock speed.
Signature

Ed Light

Better World News TV Channel:
http://realnews.com

Bring the Troops Home:
http://bringthemhomenow.org
http://antiwar.com

Iraq Veterans Against the War:
http://ivaw.org
http://couragetoresist.org

Send spam to the FTC at
spam@uce.gov
Thanks, robots.

Dave - 19 Sep 2008 13:17 GMT
> One core of a Phenom will run a little faster than one core of an X2,
> given the same clock speed.

Thank you SOO much.  That is great to know.

--Dave
Dave Feustel - 19 Sep 2008 14:24 GMT
The effective clock speed of a single core in a multiple core chip is
the chip clock speed divided by the number of cores, so an application
running on a single core in a multicore chip will run slower than the
same app running on a single core cpu. BUT in a multicore cpu the
app will experience fewer task switches for interrupts, etc because
there are other cores to run the interrupts, etc on. Since each
core has its own set of registers, less time is spent saving and
restoring register data, of which there is a lot on X64 cores.
So whether a single-threaded app runs faster or slower on a
multicore chip is a little hard to predict a priori.
Scott Lurndal - 19 Sep 2008 18:16 GMT
>The effective clock speed of a single core in a multiple core chip is
>the chip clock speed divided by the number of cores,

This is incorrect.

All cores run at the same clock speed, which is the 'chip clock speed'. Of course
the power-management capabilities of the processor allow the operating system to
individually ramp-down the voltages and frequencies of each core to allow them to
run slower (when idle), but the norm is for all cores to run at the same clock
speed which is equal to (not a fraction of) the chip clock speed.

So called SMT (aka Hyperthreading) is different, in that the secondary thread is
leveraging otherwise idle execution and load/store resources on a single core.

scott
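
Editor's aside: the claim that every core runs at the full chip clock (unless power management has slowed it) can be checked directly on a Linux x86 box, where `/proc/cpuinfo` reports a `cpu MHz` line per logical CPU. The parser below is a minimal sketch under that assumption. The function name and the sample text are invented for illustration, and the field names vary across architectures and kernels.

```python
def core_frequencies_mhz(cpuinfo_text):
    """Return the 'cpu MHz' value of each logical CPU, in file order."""
    freqs = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            freqs.append(float(value))
    return freqs


# Made-up sample (not from the thread): core 1 has been clocked down while idle.
SAMPLE = """\
processor\t: 0
cpu MHz\t\t: 2400.000
processor\t: 1
cpu MHz\t\t: 1200.000
"""

print(core_frequencies_mhz(SAMPLE))
```

On a real system, pass in the file contents instead: `core_frequencies_mhz(open("/proc/cpuinfo").read())`. Under load, each entry should sit near the advertised clock rather than at a fraction of it.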
Zootal - 19 Sep 2008 19:39 GMT
> So called SMT (aka Hyperthreading) is different, in that the secondary
> thread is
> leveraging otherwise idle execution and load/store resources on a single
> core.
>
> scott

I don't get this - what can hyperthreading do that a good cpu scheduler
can't do? If I have two virtual cores, I have to have two schedulers running
(one for each virtual cpu), each with their own set of queues and each with
50% cpu time. Is that more efficient than one single scheduler that has 100%
cpu time?
Scott Lurndal - 19 Sep 2008 23:41 GMT
>> So called SMT (aka Hyperthreading) is different, in that the secondary
>> thread is
[quoted text clipped - 5 lines]
>I don't get this - what can hyperthreading do that a good cpu scheduler
>can't do?

Leverage otherwise idle resources in the core.   A core typically has
two or more integer ALUs and one or more floating point ALUs.  These
allow superscalar behaviour (i.e. multiple instructions can be in flight
at the same time (multiple issue)).  However, for many instruction streams, not all
of the ALUs and FPUs are used, so a second 'logical' processor (the
hyperthread) can be made available to the operating system to take advantage
of those idle resources.

Note that even with HT/SMT, the operating system sees them as two
distinct cores, even though they aren't really stand-alone cores.

A four physical core processor with SMT will appear to the
operating system as 8 logical cores.

> If I have two virtual cores, I have to have two schedulers running
>(one for each virtual cpu), each with their own set of queues and each with
>50% cpu time. Is that more efficient than one single scheduler that has 100%
>cpu time?

There is only one scheduler in a typical operating system.  It schedules
across all logical cores and is typically NUMA and SMT aware in order to
make optimal scheduling decisions.   NUMA awareness means scheduling
user threads/tasks on a CPU close to memory.  SMT aware schedulers understand
that resources are shared and attempt to schedule related threads (i.e.
threads from the same process/job/task) on the secondary threads.

scott
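
Editor's aside: the difference between logical and physical cores described in this post shows up in `/proc/cpuinfo` on Linux x86, where SMT siblings share the same `physical id` and `core id`. The sketch below counts both. The helper name and the sample text are assumptions made for illustration, and not every kernel or architecture exposes these fields.

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) CPU counts from /proc/cpuinfo-style text."""
    logical = 0
    physical = set()
    package = core = None
    # A trailing empty line guarantees the last stanza is flushed.
    for line in cpuinfo_text.splitlines() + [""]:
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            package = value
        elif key == "core id":
            core = value
        elif not line.strip() and core is not None:
            # Blank line ends a stanza: record which physical core it was.
            physical.add((package, core))
            package = core = None
    return logical, len(physical)


# Made-up sample: one package, two physical cores, two SMT threads per core.
SAMPLE = """\
processor   : 0
physical id : 0
core id     : 0

processor   : 1
physical id : 0
core id     : 0

processor   : 2
physical id : 0
core id     : 1

processor   : 3
physical id : 0
core id     : 1
"""

print(count_cores(SAMPLE))
```

For the sample this reports four logical CPUs on two physical cores. Python's own `os.cpu_count()` returns the logical count, which is the number the operating system's scheduler works with.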
DevilsPGD - 20 Sep 2008 00:25 GMT
>I don't get this - what can hyperthreading do that a good cpu scheduler
>can't do? If I have two virtual cores, I have to have two schedulers running
>(one for each virtual cpu), each with their own set of queues and each with
>50% cpu time. Is that more efficient than one single scheduler that has 100%
>cpu time?

The problem that Hyperthreading was designed to solve is that the P4
series has an extremely long pipeline.

In other words, it takes many cycles to get instructions to the CPU, and
for the CPU to send instructions to pull data to/from memory or other
hardware components.

Hyperthreading was designed to help/encourage existing OSes to schedule
multiple threads/workloads so that the CPU can run them, from the OS'
point of view, concurrently, rather than waiting for one workload to
finish before sending another.
Zootal - 20 Sep 2008 01:40 GMT
> In other words, it takes many cycles to get instructions to the CPU, and
> for the CPU to send instructions to pull data to/from memory or other
> hardware components.

That isn't exactly correct - the long pipeline *is* the cpu, it just takes a
lot of cycles to make it through the pipeline. In order to get the
advertised clock speed, they had to make the pipeline longer. The P4
Prescott 3.8GHz has a 31 cycle pipeline.
Bill - 20 Sep 2008 00:40 GMT
> The effective clock speed of a single core in a multiple core chip is
> the chip clock speed divided by the number of cores,

Have you got a cite for that?

<snip>

        Bill

Dave Feustel - 21 Sep 2008 14:21 GMT
>> The effective clock speed of a single core in a multiple core chip is
>> the chip clock speed divided by the number of cores,
[quoted text clipped - 4 lines]
>
>                Bill

The person who told me this is Miles R***, a person who sells computers
for a living. If the cores ran at the chip's nominal clock speed, a
four-core chip would perform 4 times faster than a single core chip at
the same clock speed, which they don't. And the power consumption would
be much higher. So I think Miles is correct.
Miles Bader - 21 Sep 2008 14:35 GMT
> The person who told me this is Miles R***, a person who sells computers
> for a living. If the cores ran at the chip's nominal clock speed, a
> four-core chip would perform 4 times faster than a single core chip at
> the same clock speed, which they don't. And the power consumption would
> be much higher. So I think Miles is correct.

No, this is not correct.

Either you misinterpreted "Miles R***", or he is quite ignorant about
his own product (or both).

-Miles

Signature

Genealogy, n. An account of one's descent from an ancestor who did not
particularly care to trace his own.

Rodney Pont - 21 Sep 2008 15:13 GMT
>The person who told me this is Miles R***, a person who sells computers
>for a living. If the cores ran at the chip's nominal clock speed, a
>four-core chip would perform 4 times faster than a single core chip at
>the same clock speed, which they don't. And the power consumption would
>be much higher. So I think Miles is correct.

The four core chip can only run an application on all four cores if
it's threaded and at least 4 threads have work that can be run
simultaneously. Even in threaded applications this can't always happen
unless the threads are doing something that doesn't depend on others,
say converting a video file where each core can be given a section of
the file to convert.

I can see how he came to the conclusion though if he ran a single
threaded application and it ran four times slower than expected, since
it ran on only one core. Get him to run four of them at the same time
and they should complete in nearly the same time as one providing he
isn't running anything else at that time.

As for power consumption my dual core chip uses 45 watts and the quad
core version uses 95 watts. Taking into account the extra circuitry for
the 4 cores it's about right.

Signature

Regards - Rodney Pont
The from address exists but is mostly dumped,
please send any emails to the address below
e-mail    ngpsm4 (at) infohitsystems (dot) ltd (dot) uk

Jim Beard - 21 Sep 2008 17:45 GMT
>> The person who told me this is Miles R***, a person who sells computers
>> for a living. If the cores ran at the chip's nominal clock speed, a
[quoted text clipped - 18 lines]
> core version uses 95 watts. Taking into account the extra circuitry for
> the 4 cores it's about right.

One must also bear in mind that a dual-core or quad-core CPU has to
devote some processing time to deciding what to run on which core,
when.  This more intricate scheduling task routinely results in a
process running on only one core running more slowly (all things
included) than a process running on a single-core CPU that is lightly
loaded.

Whatever the CPU speed is in GHz or MHz, all cores will work at that
speed unless power management software readjusts the speed.  That
does not mean that all that speed is usable, though.  You still have
delays due to I/O requirements, scheduling delays, wait states, and a
host of other bottlenecks, real and potential.  My home computer is
an AMD 64-bit 5000+ dual-core, and CPU usage typically is in the 1 to
3 percent range when I am not compiling or doing some other
CPU-intensive task.  This does not mean that all tasks complete
instantaneously nor that response time is zero (though it is very nice,
I will admit).

Specifically with respect to X2 vs X4, the kernel scheduler will do a
fairly good job of using two CPUs, but rarely does well with more
than two unless the applications are specifically tailored for
multi-CPU usage.  Thus, the percentage gain in performance from
shifting from single to dual-core cpu is likely to be significantly
greater than the percentage gain from shifting from dual-core to
quad-core, unless you have software tailored for the additional cores.

The big question, of course, is, are your applications CPU-intensive
enough to make use of the available capacity, regardless of number of
cores?  If the computer is not heavily loaded at least part of the
time, the answer is likely to be no.

Cheers!

jim b.

Signature

UNIX is not user unfriendly; it merely
     expects users to be computer-friendly.
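
Editor's aside: one rough way to answer the "is my application CPU-intensive enough?" question from this post is to compare the CPU time a task consumes with the wall-clock time it takes. The sketch below uses Python's `time.process_time` and `time.perf_counter`; the helper name `cpu_fraction` is made up for illustration, and the result is only a coarse indicator.

```python
import time


def cpu_fraction(func):
    """Run func() once; return CPU seconds used divided by wall-clock seconds elapsed."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


# A task that mostly waits (here, a sleep standing in for I/O) scores near zero.
print(cpu_fraction(lambda: time.sleep(0.2)))
# A pure arithmetic loop keeps one core busy and scores much higher.
print(cpu_fraction(lambda: sum(i * i for i in range(1_000_000))))
```

A task that scores near zero spends its time waiting, so a faster or additional core is unlikely to help it. Note that `time.process_time` sums CPU time over all threads of the process, so a multi-threaded task can score above 1.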

Scott Lurndal - 21 Sep 2008 20:56 GMT
>Specifically with respect to X2 vs X4, the kernel scheduler will do a
>fairly good job of using two CPUs, but rarely does well with more
>than two unless the applications are specifically tailored for

maybe with respect to windows, but linux schedulers are O(1) over
large numbers of cores.

scheduler overhead is pretty much non-existent.

scott
Zootal - 24 Sep 2008 22:06 GMT
>>Specifically with respect to X2 vs X4, the kernel scheduler will do a
>>fairly good job of using two CPUs, but rarely does well with more
[quoted text clipped - 6 lines]
>
> scott

Are you sure about that? Each cpu has its own set of runqueues. If I have 4
cpus, I have 4 sets of runqueues to manage, and 4 sets of runqueues to
search. The runqueue itself can be searched for the next entry in O(1)
time - this is where the O(1) comes from, because the amount of time it
takes to find the next task in the queue is constant and not dependent on
the number of tasks in the queue.

I would think that the default linux scheduler is O(n) over large
number of cores, where n = the number of cores.
Scott Lurndal - 24 Sep 2008 23:12 GMT
>>>Specifically with respect to X2 vs X4, the kernel scheduler will do a
>>>fairly good job of using two CPUs, but rarely does well with more
[quoted text clipped - 16 lines]
>I would think that the default linux scheduler is O(n) over large
>number of cores, where n = the number of cores.

If you have a runqueue per core, then you simply schedule the next
entry in the queue for each core.  O(1).    Remember that code is shared by all
processors, and scheduling happens in-context - there is not a
scheduler "thread" or "job" or "task" per se.

scott
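
Editor's aside: the "O(1) per run queue" argument in this exchange can be sketched with a toy data structure. The code below is not kernel code. It is a simplified model, assuming one FIFO per priority level plus a bitmap of non-empty levels, so that picking the next task costs the same no matter how many tasks are queued. The class and method names are invented, and the real scheduler has more machinery (for example, balancing load between cores) that is left out.

```python
from collections import deque

NUM_PRIORITIES = 140  # the Linux O(1) scheduler used 140 levels; any fixed number works


class RunQueue:
    """Toy per-core run queue: one FIFO per priority plus a bitmap of non-empty levels."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.bitmap = 0  # bit p is set while queues[p] is non-empty

    def enqueue(self, task, priority):
        self.queues[priority].append(task)
        self.bitmap |= 1 << priority

    def pick_next(self):
        """Pop the oldest task at the best (lowest-numbered) non-empty priority."""
        if self.bitmap == 0:
            return None  # nothing runnable: the core would idle
        # Isolate the lowest set bit, then convert it to a priority index.
        priority = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[priority]
        task = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << priority)
        return task


# One queue per core; each core only ever consults its own queue.
runqueues = [RunQueue() for _ in range(4)]
```

Each core calls `pick_next()` on its own queue, so adding cores adds queues but does not lengthen any single scheduling decision. The cost that does grow with core count in this picture is whatever work is done to move tasks between queues, which the sketch omits.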
Bill - 22 Sep 2008 00:17 GMT
> >> The effective clock speed of a single core in a multiple core chip is
> >> the chip clock speed divided by the number of cores,
[quoted text clipped - 10 lines]
> the same clock speed, which they don't. And the power consumption would
> be much higher. So I think Miles is correct.

You're entitled to your opinion, but as far as "The effective clock
speed of a single core in a multiple core chip is the chip clock
speed divided by the number of cores" is concerned Miles R*** is full
of sh.t*, and you can tell him I said so.

You need to get to Intel's or AMD's website and do some reading.

        Bill

Richard P - 22 Sep 2008 00:28 GMT
>>>> The effective clock speed of a single core in a multiple core chip is
>>>> the chip clock speed divided by the number of cores,
[quoted text clipped - 17 lines]
>
>         Bill
I have an X4 and each core at default is 2.5 GHz.
DevilsPGD - 23 Sep 2008 07:14 GMT
>The person who told me this is Miles R***, a person who sells computers
>for a living.

"Never trust someone trying to sell you something" comes to mind.

>If the cores ran at the chip's nominal clock speed, a
>four-core chip would perform 4 times faster than a single core chip at
>the same clock speed, which they don't.

Depending on your task, a four-core CPU can perform reasonably close to
four times the clock speed of a single core CPU.  Unfortunately, few
tasks parallelize that well, and even less software takes full
advantage of modern CPUs.

That being said, aside from some shady marketing in the past advertising
dual CPU systems as double the clock speed of one CPU rather than
advertising the actual configuration, each core runs at the full clock
speed advertised.
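
Editor's aside: the point that few tasks parallelize well is usually quantified with Amdahl's law, which the thread does not name but which fits the argument. If a fraction p of a program's work can be spread across n cores and the rest is serial, the best possible speedup is 1 / ((1 - p) + p / n). The function below is a minimal sketch of that formula; the name `amdahl_speedup` is made up for illustration.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when `parallel_fraction` of the work scales across `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


for p in (0.0, 0.5, 1.0):
    print(p, amdahl_speedup(p, 4))
```

A purely single-threaded program (p = 0) gets no speedup from four cores, a half-parallel one gets 1.6x, and only a fully parallel one reaches 4x. Falling short of 4x therefore says nothing about the clock speed of each core.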
Dave Feustel - 23 Sep 2008 13:44 GMT
>>The person who told me this is Miles R***, a person who sells computers
>>for a living.
[quoted text clipped - 14 lines]
> advertising the actual configuration, each core runs at the full clock
> speed advertised.

So the 4 core chip cpu should run 4 independent identical tasks (compute
pi to 1 million digits) in essentially the same time that a single core
runs one instance of that task?
Richard P - 23 Sep 2008 18:07 GMT
>>> The person who told me this is Miles R***, a person who sells computers
>>> for a living.
[quoted text clipped - 16 lines]
> pi to 1 million digits) in essentially the same time that a single core
> runs one instance of that task?

Yes
DevilsPGD - 23 Sep 2008 20:39 GMT
>> Depending on your task, a four-core CPU can perform reasonably close to
>> four times the clock speed of a single core CPU.  Unfortunately, few
[quoted text clipped - 9 lines]
>pi to 1 million digits) in essentially the same time that a single core
>runs one instance of that task?

More or less, yes.  However, in the real world, not all tasks will scale
quite this well as many tasks require not only CPU resources, but also
other resources which may become starved before you load all four cores.

For something that can be done entirely on-chip, you'll get four times
the performance using all four cores of a quad 2.4GHz CPU than a single
core version of the same 2.4GHz CPU.
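
Editor's aside: the "run four independent copies at once" experiment discussed above is easy to try. The sketch below starts several separate Python interpreters, each running the same CPU-bound loop (a stand-in for the pi calculation, not an actual pi program), and reports how long it takes for all of them to finish. The function name, the workload, and the iteration count are assumptions chosen for illustration.

```python
import subprocess
import sys
import time

# CPU-bound stand-in for "compute pi": a plain arithmetic loop run in a child interpreter.
WORK = (
    "import sys\n"
    "n = int(sys.argv[1])\n"
    "total = 0\n"
    "for i in range(n):\n"
    "    total += i * i\n"
)


def time_copies(copies, iterations=5_000_000):
    """Start `copies` independent interpreters at once; return seconds until all finish."""
    start = time.perf_counter()
    procs = [
        subprocess.Popen([sys.executable, "-c", WORK, str(iterations)])
        for _ in range(copies)
    ]
    for proc in procs:
        if proc.wait() != 0:
            raise RuntimeError("worker exited with an error")
    return time.perf_counter() - start
```

Comparing `time_copies(1)` with `time_copies(4)` on an otherwise idle quad-core machine should give similar figures if each core really runs at full speed, while on a single-core machine the second figure should be roughly four times the first. Actual numbers depend on the machine and on what else is running.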
 



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.