Wednesday, 19 June 2013

The speed and size of a hotrod PC in 2012

I have been thinking about the speed and size of my favourite computer as it crunched its way through a combinatorial monster of a problem. Looking at the differences between the parts of a PC can really help when trying to exploit the performance characteristics of the machine.

The system in question, BB02, previously seen in Gannetts Folding hall of Fame, is quite a beefy box that was assembled at a cost of about £2000 and has recently had some memory and storage upgrades.


Starting with the memory and the Core i7 Gulftown processor, we can see in the top left of this screenshot, taken from Memtest, that the processor has three levels of cache and the machine has 16GB of memory. The cache levels decrease in speed as they increase in size.


Memtest results


The speed quoted for the memory, 9600 MB/s, sits a bit under the maximum shown in the reference table below, which lists DDR3-1333 at 10,666 MB/s. Not having identical memory modules in all 6 slots may account for part of this 10% discrepancy. Thanks to http://www.hardwaresecrets.com for the data.
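The 10,666 MB/s figure can be reproduced from the module name. This is a minimal sketch, assuming a single 64-bit memory channel and the nominal DDR3-1333 rate of 1333.33 million transfers per second; the function names are mine, not from any tool used here.

```python
def ddr_peak_mb_per_s(mega_transfers_per_s: float, bus_width_bits: int = 64) -> float:
    """Peak bandwidth of one channel: transfers per second x bytes per transfer."""
    return mega_transfers_per_s * bus_width_bits / 8


def shortfall_percent(measured: float, peak: float) -> float:
    """How far the measured figure sits below the theoretical peak, in percent."""
    return (peak - measured) / peak * 100


peak = ddr_peak_mb_per_s(1333.33)   # assumed nominal rate for DDR3-1333
measured = 9600.0                   # MB/s, the Memtest figure quoted above

print(f"peak     : {peak:,.0f} MB/s")
print(f"measured : {measured:,.0f} MB/s")
print(f"shortfall: {shortfall_percent(measured, peak):.1f} %")
```

The shortfall comes out at about 10%, in line with the discrepancy noted above.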

Memory speed table

The motherboard, an Asus P6T Deluxe V2, has 6 data ports that have been loaded with an older 5400 RPM drive, a 7200 RPM drive, a Corsair Force 3 solid state drive and three new Seagate 3TB 7200 RPM drives. The Seagate drives have been configured as a RAID 0 stripe group using the Linux Multiple Device (md) driver on Ubuntu. A stripe group shares the I/O load across all the disks in the group. The tests were performed across 2 and 3 drives.
Asus P6T Deluxe V2
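To see why striping shares the load, here is a rough sketch of how a RAID 0 layout deals fixed-size chunks round-robin across the member disks. The 512 KiB chunk size and the `locate` function are assumptions for illustration; this is not the md driver's actual code, and the real chunk size of this array is not recorded here.

```python
CHUNK = 512 * 1024  # bytes per chunk (assumed for illustration)


def locate(offset: int, n_disks: int, chunk_size: int = CHUNK) -> tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    chunk = offset // chunk_size      # which chunk of the whole array
    disk = chunk % n_disks            # chunks are dealt out round-robin
    stripe = chunk // n_disks         # complete stripes before this chunk
    return disk, stripe * chunk_size + offset % chunk_size


# A 4 MiB sequential read, chunk by chunk, on 2 and 3 disk stripe groups.
for n_disks in (2, 3):
    disks = [locate(off, n_disks)[0] for off in range(0, 4 * 1024 * 1024, CHUNK)]
    print(f"{n_disks} disks:", disks)
```

A long sequential transfer touches every disk in turn, so the drives can work in parallel, which is where the speed up comes from.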

The mature and well-known filesystem and disk speed utility Bonnie++ was used to assess the disk performance. Bonnie++ tests through the filesystem layer, rather than exercising the raw disk, giving more realistic performance numbers.

Here are the consolidated results converted to MBytes for the sizes and MBytes/second for the transfer speeds. 


We can see in the table that, for most levels, speeds go down as the size goes up. Using the faster drives in a RAID configuration results in at least a 3 times speed up over the older single drives.


Plotting the results on a chart also gives some insight into the numbers. With memory as the crossover point, we see the step down between CPU and storage speeds and their relative sizes. The vertical axis is on a log scale, showing the progression from the 32K cache size to the 9 terabytes of the 3 way stripe storage.

Sizes and speeds of processor caches, memory and various storage.


Looking at the sizes of the various elements we can see that the variation in size between the largest (3 way stripe) and the smallest (L1 cache) is much bigger than the variation in speed. Interestingly, the difference between main memory speed and the slowest disk is of about the same order as that between a cheetah and a tortoise.

Other comparisons are:
Max differential, speed scale: 2472
Max differential, size scale: 281250000.00
Diff Mem / L2 cache: 6.82
Diff Mem / Fastest disk: 49.66
Diff Mem / Slowest disk: 362.26
Diff speed, Cheetah / Tortoise: 411.76
Diff size, Stamp / Football pitch: 5000000.00
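Some of these ratios can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 KB = 1000 bytes, 1 TB = 10^12 bytes), which is what reproduces the size figure, and back-calculates the disk speeds implied by the quoted ratios and the 9600 MB/s memory speed. The variable names are my own.

```python
KB = 10**3    # decimal units, assumed
TB = 10**12

l1_cache_bytes = 32 * KB     # smallest element: the 32K L1 cache
stripe_bytes = 9 * TB        # largest element: the 3 way stripe

size_ratio = stripe_bytes // l1_cache_bytes
print(f"largest / smallest size: {size_ratio:,}")

memory_mb_per_s = 9600.0     # Memtest figure
mem_over_fastest_disk = 49.66
mem_over_slowest_disk = 362.26

# Disk speeds implied by the ratios in the table.
fastest_disk = memory_mb_per_s / mem_over_fastest_disk
slowest_disk = memory_mb_per_s / mem_over_slowest_disk
print(f"implied fastest disk: {fastest_disk:.1f} MB/s")
print(f"implied slowest disk: {slowest_disk:.1f} MB/s")
```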

Also included in the main table above are the seek operations per second for the storage drives, which clearly show the distinct advantage of SSD technology in a read situation. The ability of SSDs to maintain full transfer speeds for reads (including random reads) makes them particularly useful as database index and system drives.



Corsair SSD



Windows 7 has a performance comparison utility built into the Control Panel that scores elements of a system between 1 and 7.9. The system under consideration scores a respectable 7.8 on all elements except disk drive speed, which scores 7.6.



For real CPU work, Folding@home gives a system a real workout. This system delivers over 36 GFlops, which is more than a Cray T932.

Writing final coordinates.
[02:51:00] Completed 500000 out of 500000 steps  (100%)

 Average load imbalance: 0.7 %
 Part of the total run time spent waiting due to load imbalance: 0.4 %
 Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 1 % Y 0 %


Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:  17278.673  17278.673    100.0
                       4h47:58
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    711.518     36.599     10.001      2.400 
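The timing figures in this log are consistent with one another, which can be checked with a small sketch. It assumes that ns/day means simulated nanoseconds per wallclock day; the helper names are mine.

```python
def hours_per_ns(ns_per_day: float) -> float:
    """Wallclock hours needed to simulate one nanosecond."""
    return 24 / ns_per_day


def simulated_ns(wallclock_s: float, ns_per_day: float) -> float:
    """Nanoseconds simulated in the given wallclock time."""
    return wallclock_s / 86400 * ns_per_day


def hms(seconds: float) -> str:
    """Format seconds in the log's h, mm:ss style."""
    s = int(seconds)
    return f"{s // 3600}h{s % 3600 // 60:02d}:{s % 60:02d}"


wallclock = 17278.673   # Real (s) from the log
rate = 10.001           # ns/day from the log

print(hms(wallclock))                              # 4h47:58
print(f"{hours_per_ns(rate):.3f} hour/ns")
print(f"{simulated_ns(wallclock, rate):.2f} ns simulated")
```

On these numbers the 500000 steps cover about 2 ns of simulated time.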


When looking at network speeds be sure to register the distinction between MB/s (megabytes per second) and Mb/s (megabits per second). Long distance lines and telcos will often quote Mbits/s, but payloads are usually measured in MBytes. A high res photo is about 6 MBytes, so it would take 6 seconds on a 1 MByte/s line but 48 seconds on a 1 Mbit/s line.
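The factor of 8 between bits and bytes is easy to lose track of, so here is a small sketch of the conversion. The function name and the unit strings are my own choice for illustration.

```python
def transfer_seconds(size_mbytes: float, rate: float, unit: str) -> float:
    """Time to move a payload of size_mbytes over a line quoted in MB/s or Mb/s."""
    if unit == "MB/s":
        mbytes_per_s = rate
    elif unit == "Mb/s":
        mbytes_per_s = rate / 8      # 8 bits to the byte
    else:
        raise ValueError(f"unknown unit: {unit}")
    return size_mbytes / mbytes_per_s


photo = 6  # MBytes, a high res photo
print(transfer_seconds(photo, 1, "MB/s"))   # 6.0 seconds
print(transfer_seconds(photo, 1, "Mb/s"))   # 48.0 seconds
```

The same photo on a 64 Mbit/s line would take 0.75 seconds, ignoring protocol overhead.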

In the house there is Gigabit wired networking and Wireless infrastructure.

Wireless - 8.9 MB/s


Wired ( Gigabit) 70 MB/s


The numbers above were obtained using a simple shared folder drag and drop file move. Whilst this is a good indication of real world performance, the wired number is misleading because the file transfer speed is capped by the disk speeds of the PCs involved.

Using the iperf network test utility, which just sends network data between the same two machines, we see that the wired network gets over 900 Mbits/s (113 MBytes/s).


------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.11 port 53619
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.05 GBytes    906 Mbits/sec
[  5] local 192.168.1.2 port 5001 connected with 192.168.1.11 port 53620
[  5]  0.0-40.0 sec  4.28 GBytes    920 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.11 port 53623
[  4]  0.0-40.0 sec  4.27 GBytes    916 Mbits/sec

and the wireless route gets up to about 100 Mbits/s (12 MBytes/s) when forcing a large TCP window size.

$ iperf_Intel -c p.local -w 256K
------------------------------------------------------------
Client connecting to pup02.local, TCP port 5001
TCP window size:   257 KByte (WARNING: requested   256 KByte)
------------------------------------------------------------
[  5] local 192.168.1.70 port 63886 connected with 192.168.1.68 port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec    122 MBytes    102 Mbits/sec
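The Transfer and Bandwidth columns in these reports do not use the same prefix convention. Judging by the numbers, the byte counts are binary (1 MByte = 1024 x 1024 bytes) while the bit rates are decimal (1 Mbit = 1,000,000 bits). A sketch under that assumption, with a helper name of my own:

```python
BYTES_PER_UNIT = {
    "KBytes": 1024,
    "MBytes": 1024**2,
    "GBytes": 1024**3,
}


def reported_mbits_per_s(transfer: float, unit: str, seconds: float) -> float:
    """Recompute the Bandwidth column from the Transfer column and the interval."""
    bits = transfer * BYTES_PER_UNIT[unit] * 8
    return bits / seconds / 1_000_000


# Rows taken from the reports above; the printed Transfer value is rounded,
# so the result only matches the Bandwidth column approximately.
print(f"{reported_mbits_per_s(122, 'MBytes', 10.0):.0f} Mbits/sec")    # wireless
print(f"{reported_mbits_per_s(4.28, 'GBytes', 40.0):.0f} Mbits/sec")   # wired
```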

Out to the Internet we have a good service from BT Infinity giving 64 Mbits/s (8 MBytes/s) download and 2 MBytes/s upload when testing with the well respected Speedtest.net. That is about the same as the internal wireless connection. Using a wired connection about 70 Mbits/s is seen.
















