How many cores does Sandy Bridge have?

By blindly scaling the area, i7 would gain an extra 1. Since BD is likely to be more competitive in terms of performance, it will most likely also be priced higher than the current stuff. Good decision. I am holding out to see what Bulldozer brings to the table.

Look at the article. Damn, this just got expensive. Oh well. The only sad face about SB I've seen is that Quick Sync, or whatever the video transcoder thing is called, only works when you're running on the IGP. If you've got a discrete card in your system, then you'll have to stay with standard x86 encoding.

That's annoying, since that was one of the big selling points for me when looking at the speedups it offered. It is said in this thread that one SB core is 30mm2. I don't know where your 20mm2 comes from. Think about that for a second. AMD with 1. At what cost would this come? Well, there we go. By comparison, Sandy Bridge is far more tightly integrated than the Bulldozer die. Mind you, the shaders of today are not quite comparable to those of either in number or speed or especially capabilities.

Not true. However, AMD has re-introduced their Catalyst-level transcoder and ported it into Stream, which encodes via the programmable shader units inside the GPUs. The FSB was architected with the Pentium Pro, and major attention was paid to implementing a cache-coherent protocol for multi-socket operation as well as serving as the bus to the NB.

NB simply reused this bus from the P6 generation with little or no real changes (perhaps some pipelining depth or something; I don't completely remember many of the details). I do agree, though I'm not entirely certain, that the branch predictor's key components were probably borrowed heavily from Netburst.

Netburst was a 3-issue design; going wider was simply a design choice specific to Conroe. Macro-op fusion was also specific to Conroe; micro-op fusion was in place in Banias, as I recall and as do you.

Frankly, macro-op fusion was not well implemented in Conroe in my opinion: it only really fused two complementary instructions and, due to the resulting length, did not work in 64-bit mode. Macro-op fusion, in my opinion, was only properly done in Nehalem. Hans has a pretty good and established record of estimating die sizes from obscure die shots. BD is very cache heavy, so estimating on the core alone seems to be a bit off.

There is 8 MB of L3 cache as well, and the northbridge, which will probably be well overhauled. The sooner people get hammered by the usual Intel hype, the better, as inertia will help when the BD baseball bat strikes… Hans is a respected author, but on this one, I think he got it wrong. Besides, that shot was publicly known to have been heavily photoshopped before publication, so my point of view is as valuable as Hans's…

From what I can see on the link, when hyperthreading is enabled they compare the K 3. It's simply that OEMs and manufacturers already have BD samples and that its performance is already known in restricted circles, including of course Intel. I am curious about what context they would make that statement in. But nothing following that explains why you became less impressed with the CPU; it reads more like you are upset that Intel would pull in a launch and lower prices. Rather than eyeball it, someone has taken the data and compiled a core-for-core, clock-for-clock (turbo off, SMT off, and both on) comparison.

Kinda puts you up a tree if you migrate big drives over. That said, motherboards are getting there, feature-wise. Intel elects to put the best integrated graphics cores (HD 3000) only on the K-series chips. But K-series chips will undoubtedly be purchased by enthusiasts who will run discrete cards. Why not put the best graphics on the lower-end chips that will undoubtedly be used in the vast majority of mainstream OEM systems?

Of course, you could buy an H67 motherboard to pair with your K-series processor, to take advantage of the integrated graphics, but then you lose the ability to overclock your K-series processor, which is ostensibly the primary reason you bought a K-series processor in the first place.

Supposedly all these lame issues will be corrected by an upcoming chipset named Z68. The thought of Sandy Bridge being such a failure that the battle this summer, and for the back-to-school season, became Lynnfield vs. Bulldozer... well, if I were a beancounter at Intel, that thought would scare the crap out of me. But a non-negligible departure would be a risk. The improved power consumption mostly comes from the die shrink. I think a 32nm Lynnfield would have received this praise.

AMD is only just matching the Conroe core for per-clock performance. Short memory? A guy asked why Intel prompted the launch two days earlier. It's simply that OEMs and manufacturers already have BD samples and that its performance is already known in restricted circles, including of course Intel… The first i7 seemed to push the bar a lot higher, relatively speaking, than this. Intel has gotten much more savvy over the years since AMD thrashed it royally with the K7 and up; they thrashed them pretty good right up until Core 2 hit the bricks, when the thrashing proceeded to operate in reverse…

I think the point is that in the old days Intel would have priced SB wherever it wanted to—into the stratosphere, probably—and never have worried at all about the Total Cost of Upgrade to SB, and assumed that most people would migrate to SB without much regard for the TCU.

This is completely out of character for Intel. When was the last time you recall Intel releasing a brand-new, very high-performance CPU that outperformed the previous generation of its CPUs by a fair margin in many cases? True, but it seems nowadays that even lowly mATX boards have more ports and whatnot than most people are going to use. Like dual ethernet… really? Enthusiast motherboards are being used as servers or firewalls?

I was definitely in the category of building a full tower system with a full-size ATX board until I realized all those slots and drive bays are never, ever going to be used. As for the X6 perf in Cinebench: SB has the upper hand because of the process node, but if the P2 was to be shrunk to 32 nm, it would be as efficient as Intel's best offering if not more, as an 8C P2 would easily battle it, even in terms of power efficiency. You're completely wrong.

Why should AMD's uncore be significantly bigger at the same node? All of that X8 theory is nice and all, but… what does that have to do with your claim about Cinebench? That will be the size of exactly that. None of the critical logic that is required to use the modules will be included. What about the memory controller? What about cache control, HyperTransport, power gating, and all the interfaces each core requires to communicate with other cores?

Wow, such an inspired response. Well, now both of us have seen everything. Intel's superiority is mainly the 32nm process. The way I see it, the day LGA came out, going that route made no sense other than for people with enough cash to buy the extreme high end. A module is 31mm2 with 2MB cache included; add 52mm2 for the 8MB L3 cache, and the remaining 24mm2 are more than enough for the rest of the uncore.

But not ssk. We need a double thumbs up for him. As someone who was going to get duped into buying into the LGA platform, you were going to overpay for the motherboard anyhow. I want to thank TR for including the Core i7 in the benchmarks. This was an excellent comparison, and I think I will be upgrading to the K instead of the i. What DRM controversy? At this point, what I am reading is that it is going to be on big-box systems like those you can buy from Best Buy?

The general TR populace does not buy from BB anyway, no? You never achieve perfect scaling when moving an old arch to a new process. You are very ambitious, plus the numbers you were throwing around were beyond what even double scaling could achieve. I think you need to sit down with a sheet of paper and a calculator and figure out what your story is.

Did you look at the Cinebench results? Although I get what you mean. It looks like a decent CPU to me, at a decent price. The on-board graphics are irrelevant to any gamer, however, making the HD graphics in the K series pointless, and doubly so given that the hardware-accelerated video transcode is not available with discrete graphics attached. A 45 to 32 nm shrink doubles the density, so that puts the X6 at mm2 at most, and no more than that for the P2 X4.

How do you figure that? Indeed, using 32nm, current X4 or X6 would be quite competitive. The typical Krogoth post is usually some kind of disparaging remark with respect to whatever the subject of the article is.

Most of the users of Intel integrated hardware are going to do a lot more video than gaming. Take a low-end notebook for surfing the internet, accelerate all the Flash videos on a low-powered decoder, and you will save a lot of battery life. Yeah, they will be like they are currently.

Selling huge die monsters for nothing. This brings problems itself, though, as it can lead to non-deterministic results depending on what code the compiler generates. The next step will be bit. Who in general-purpose computing really needs that precision, though? Good point: how can Intel not produce an integrated solution today to compete with an integrated solution from 5 years ago?

Scott, how do you explain the low score for the Athlon II X3 in the 7-zip benchmark? Do you think that it is using only 2 cores? Its score is much lower than the Phenom II X4's. Thank you. AMD will have an adequate counterpart, as they will have better offerings at the top and bottom… You do understand that in this TechPowerUp test the games were first tested with a GeForce, right? SB has a uop cache, where uops are stored during the decode phase; this is similar to a trace cache (NetBurst), whereas Conroe through Nehalem did not have one.

Conroe, even up through Nehalem, could be traced back as kin to Banias, Dothan, and Yonah, which were the brethren of P6, which diverged from Netburst specifically to go after lower-power mobile in their time. Can someone boil it down for me? Is it anything to be concerned about, and would it prevent any of you from purchasing one of these parts? Edit: Disregard.

False alarm. Fixed function, naturally, lacks the flexibility to adapt to new video codecs as they become available. In short, GPUs will still be the best option for workstation-level video transcoding via prosumer and professional-level software, whereas SB is really nice for consumer-oriented transcoding utilities, since that market has settled on a standard much like JPG for images.

The answer to your question is in some of the slide decks that have been published over the past few months… decode and encode are dedicated hardware outside of the EUs. That is why the smart buyers will always skip a generation or 2 or even 3. To put it simply, within their price brackets, they are a tactical nuke. No two ways around that. Keep in mind that Anand also noted that modern GPUs use fixed-function hardware decoders as well, rather than using the shaders to decode.

Since the app in use is still pre-release, I would not say for sure at this point that using the CPU hardware encoder is going to suck in terms of quality. Intel boards are not usually known for their overclocking features. I would wait for proper reviews from the usual hitters before passing judgement on the H67. How can you judge gaming performance when the reviewer uses a ridiculous choice of GPU? The lower power consumption is very nice, but that mainly benefits smaller form factors.

The most impressive part of Sandy Bridge is the integrated graphics. It rivals the current budget discrete solutions, which is good enough for mainstream gamers. It looks like it will succeed in that regard. The upcoming Fusion chips will have entirely new sockets. Whoever came up with that idea must have been on Special K. I am not sure that the one-less-stick-of-RAM argument holds, as you do not have to use three sticks in a board if you choose not to.

However, it is handy having the additional RAM for multitasking. I am just disappointed by the gaming performance increase, given that this is a brand-new architecture.

You will have to read the architecture preview article linked from this review, because Damage has said he would not delve into that discussion in the context of that review. Conroe brought us 4-issue-wide pipelines, memory disambiguation, and a new focus for desktop processors on performance per watt. Nehalem brought us the IMC and a modular building-block approach. This time, we have the new ring bus, the on-chip GPU, and a lot of new things; I remember Damage said everything from the branch predictor to most other aspects of the processor has been changed.

I wonder how they will handle Llano then? Do you think that will generate a new socket? That is ultimately the chip that will fit up against what was shown today.

However, that is moot… it is clear, AMD does a much better job at synergy and extending socket lifetime, but expect new sockets from AMD in the future as well. Only with a massive overclock to 4. The top end sandy bridge will release later this year. The i7 is a consumer, high volume part. For a quad core chip to get close to the X 6 core is pretty remarkable, and speaks to some of the architectural improvements. You find it odd, I also find it odd that someone would complain about getting X like performance in the price bracket.

I am confused about the conclusion of the article. Scott, you state that Sandy Bridge is a new Intel processor architecture. Also, how is it totally new vs. Conroe and Nehalem? I thought SB was just a heavily tweaked Nehalem. Their weakness is when you look at total system cost. Usually you can find cheaper mobos on the AMD side, and the sockets tend to last longer as well, long enough that you may actually get an upgrade out of them.

If someone is on a budget, they likely will be in the future too, so that upgrade option is nice to have. The was not tested in this review, and the was consistently beaten by the K.

So I ask again, what are you smoking? They will be on those BestBuy boxes but who knows what chipset they will use. Choice is good, right? As the market leader, Intel is betting that given the option, customers will decide to purchase a new Intel motherboard instead.

AMD maintains as much compatibility across sockets as possible so that customers factor in the cost of a new mainboard when considering a switch to Intel. In short, their behavior is determined by their respective market positions. And just as with those, the only way it will be active is if you purchase a motherboard using the appropriate chipset and enable it in the BIOS. This new vector stuff looks nice, though, and will enable more speedups in certain routines. The days of blanket improvement with a bump in GHz are over, so they have to do tricks now.

Stop paying so much attention to it then. Anyone who wants to read what you have to say will do so, regardless of whatever rating your post gets. It started weeping and asked me to bring it chocolates. The fact that AMD still offers pretty competitive multi-core performance at their current prices speaks to their strengths. Intel could cut all their prices more aggressively to have an outright victory on all fronts, but they have never been that aggressive. At some point the integration of more components onto the CPU will necessitate new signal pins, or re-arrangement of some.

Since you cannot get easy performance gains by just bumping the clock or adding more cores, the two CPU guys are now looking at overall platform stuff to eke out any gains they can. Hats off to TR for including Cinebench. Oh yes, he did use it in his first preview, but he removed the X6… This time, he didn't use it, but he mentions it in his current article to say that the P2 X4 is behind SB, yet no graph… Hey, I thought that SB was a processor for modern software…

No need for this. Some cool improvements, though. Holy crap, bad memory. Err, my memory was bad. I think I might be the combined thumbs-down champ over here (although grantmeaname up there is making a race out of it)… even my reasonable posts get thumbed down with a vengeance. How was I to know? I mean, what would it take for Intel to impress you, ship you a CPU and a cake with a stripper in it?

The performance potential of Llano is down to two relatively well known quantities and has had much educated guesswork and speculation devoted to it already. The key factor for Llano is that it is going to be wholly reliant on software optimizations to make use of any synergies between the GPU and CPU.

AMD will still continue to compete and survive in the marketplace, but anyone expecting a KO like they did with the K8 vs. P4 is probably going to be disappointed. Yeah, I know, Mister No-sense-of-humor. The server could handle another 4kb for a decent jpeg. Man, and I was trying to see what the fuss is all about, and the SA site goes down again!

What kind of narrow window of uptime does that site have? What a baffling feature when GPUs are getting more and more flexible. I dub thee MMX2. Probably not. The socket needs to change. The days of socket compatibility between new and old generation CPUs are long gone.

Unless you are doing some hardcore content creation or number crunching, or simply want to cut down on power consumption without killing performance. It incorporates their strengths and overcomes their shortcomings. It further improved on those strengths. That is why Conroe was such a monster at launch. It destroyed its Netburst predecessors and stomped its K8-based rivals. But spending more to get less? The motherboard is likely to be ish more as well. I doubt that a GTX would make things any better.

Sandy Bridge is getting too many accolades. It all started with Nehalem. Calling it the biggest architectural change since Netburst is a bit much. It is really just an extremely evolved Conroe. IMO, the last real architectural change from Intel was Conroe. Ivy Bridge is looking like the next generation. Is it fast? Of course.

Does it do it with better power efficiency? Is it enough of a gap to be worth upgrading over the previous generation? Depends on your needs. It yields enough performance for mainstream gamers. Anyway, the upcoming Haswells will be more impressive.

I expect it to rock at rendering and number crunching. It looks like the fears of the K series commanding a hefty premium were unfounded. It is just a minor tax that overclockers can easily stomach. I feel the same way with an E at 4GHz. I do appreciate the scientific, synthetic, 3D rendering, etc. tests, just to see what the CPUs are really capable of.

I just wanted to note with some concern that the description of the more aggressive turbo boost implies that it may be boosting these benchmarks if they are being run after a period of idling in which the computer was able to cool off. I also suspect that if the temperature is being monitored to control the speed, an open-air test bench with good cooling could also be boosting the results.

It would be interesting to see if you guys can define a method of showing the effects of turbo boost. Perhaps heat the thing up running Prime95 and then jump into a benchmark, or use a weaker cooling setup, and see if the numbers change. I would be interested to see if these kinds of things make a difference.
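
Something like this crude harness would do it (illustrative only; the 30-second warm-up and the toy workload are arbitrary stand-ins for a real benchmark):

```c
#include <stdio.h>
#include <time.h>

/* Wall-clock seconds. */
static double now(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Toy compute-bound workload standing in for the real benchmark. */
static volatile unsigned long sink;
static void workload(void) {
    unsigned long acc = 0;
    for (unsigned long i = 0; i < 400000000UL; i++)
        acc += i * i;
    sink = acc;
}

int main(void) {
    /* Warm-up: keep the CPU busy so any turbo thermal headroom is used up. */
    double t0 = now();
    while (now() - t0 < 30.0)
        workload();

    /* Measure while hot; compare against a run without the warm-up phase. */
    double t1 = now();
    workload();
    printf("hot run: %.3f s\n", now() - t1);
    return 0;
}
```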

That being said, the performance, even if it is perhaps slightly boosted from what one might get in the real world, is very good. I would like to see some results of clock-for-clock comparisons of Sandy Bridge and Lynnfield with turbo turned off and the processors at the same frequencies.

I realize that will be a purely academic exercise from a consumer point of view, but it will still give a lot of insight into the architecture. That ploy was a little too obvious when disconnecting or jumpering a few pins together makes a new CPU work magically fine in an old motherboard (as it did for me), so now they just release a new socket every 6 months. This, of course, probably eats into the transistor budget, but Intel, with their large advantage in manufacturing, would scarcely notice.

Intel is a business, and they are motivated by profits. AMD is going to go further to make the sale, whether it nets them a sale of a new accompanying chipset or not. See, I have to disagree with that a bit. I wonder if AMD will continue their differing approach; the X6 core processor for under was a very good answer to Intel's CPUs leading up to Sandy Bridge.

I wonder if AMD will push 8 or more cores for under bucks this coming year? If maintaining socket compatibility would in any way hamper the advancements Intel can make to the CPU, then I see it as a losing proposition.

My opinion is that a Yorkfield or Deneb or better is not worth upgrading from unless you are doing sick amounts of video encoding. Very impressive. I definitely want one, but that poor thing would just sit largely idle in my machine, which is only really used for gaming. Under Ubuntu, plug the same stick into a USB2 port, and it works fine. We await appropriate drivers from Intel for re-testing, but as of press time, none were available.

Charlie has a habit of ranting over tiny details, but the lack of proper Linux drivers is not necessarily tiny… My current setup is still plenty fast enough for my needs (Phenom X4). But yeah, wow. Stupid fast. Excessively fast. But for content creation, an upgrade might be worth considering.

I do not agree with thumbing down comments. Can you spot the difference? As you say, most games are not CPU-bound. It is clear that a 4GHz i7 would utterly destroy an overclocked K at 4.

There are no tradeoffs to be made. This is a huge contrast to the P4, which involved big tradeoffs relative to the P3. I suspect that Bulldozer will take a different approach, bringing more targeted performance increases that will be of great interest to specific markets.

Intel is still pushing the idea of the general purpose processor that is used for everything from laptops to server farms and SB is a pinnacle achievement for general purpose processors. My own view is that AMD et al probably are more in tune with where computing needs are heading. I feel more and more people are going to agree with you.

Also, the 10 fps increase is at low rez; at higher rez, the cards make it even more minimal. Looking good. I would have expected some kind of culture from you gentlemen. Which, at its stock clock of 3. For the professional renderer or media encoder, 6 real cores and more system interconnect are better than 4 cores and a crippled southbridge, but the hardcore gamer and overclocker are not well served by Gulftown or Bloomfield in place of SB.

Awesome article, folks. We all appreciate your hard work. The power consumption improvements are impressive. My reference point throughout this article was the i7 which I own. It got trounced quite severely on quite a few occasions. If I am buying new today, why would I spend similar money on the i7 but need to suck up additional costs in at least 3 sticks of RAM and a more expensive motherboard? For people looking to upgrade from the 9xx, you are absolutely right.

No need to look for upgrades, not that their options are plentiful to begin with? No reason for this to be downrated, so I got you back to 0.

Not only would such a mechanism increase complexity, but it's also unclear how much, if any, benefit would be gained by it. But that's where the similarities end. It's worth noting that the trace cache was costly and complicated, having dedicated components such as a trace BTB unit, and had various side effects such as needing to flush on context switches. This implies a significant storage efficiency of four-fold or greater.

The Allocation Queue (IDQ) acts as the interface between the front-end (in-order) and the back-end (out-of-order). The Allocation Queue in Sandy Bridge has not changed from Nehalem; it is still 28 entries per thread.

The IDQ performs a number of additional optimizations as it queues instructions. The Loop Stream Detector (LSD) is one of them: once a small loop is detected, its micro-ops are replayed directly from the IDQ, and streaming continues indefinitely until a branch mis-prediction is reached. The LSD is particularly effective for many common algorithms found in many programs (e.g., tight loops). The LSD is a very primitive but efficient power-saving mechanism, because while the LSD is active the rest of the front-end is effectively disabled, including both the decoders and the micro-op cache.
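
As a rough illustration (our example, not something from the original text), a tight loop like the one below compiles down to only a handful of micro-ops, which is exactly the kind of body the LSD can replay out of the IDQ without re-fetching or re-decoding:

```c
#include <stddef.h>

/* The loop body is only a few micro-ops (load, add, increment,
 * compare-and-branch), so once detected it can be streamed out of
 * the IDQ by the LSD with the decoders and micro-op cache idle. */
long sum(const long *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
```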

The back-end or execution engine of Sandy Bridge deals with the execution of out-of-order operations. Sandy Bridge's back-end is clearly a happy merger of both NetBurst and P6. The implementation itself, however, is quite different. Sandy Bridge borrows the tracking and renaming architecture of NetBurst, which is far more efficient.

Sandy Bridge uses the tracking technique found in NetBurst, which uses renaming based on a physical register file (PRF). Sandy Bridge returned to a PRF, meaning all of the data is now stored in the PRF, with a separate component dedicated to the various metadata such as status information. It's worth pointing out that since Sandy Bridge introduced AVX, which extends the registers to 256-bit, moving to a PRF-based renaming architecture was more than likely a hard requirement, as the added complexity would otherwise have negatively impacted the entire design.

Unlike with an RRF, retirement is considerably simpler, requiring a simple mapping change between the architectural registers and the PRF and eliminating any actual data transfers - something that would've undoubtedly worsened with the new 256-bit AVX extension. An additional component, the Register Alias Table (RAT), is used to maintain the mapping of logical registers to physical registers. This includes both the architectural state and the most recent speculated state. This entry is used to track the correct execution order and statuses.

It is at this stage that architectural registers are mapped onto the underlying physical registers. Other additional bookkeeping tasks are also done at this point, such as allocating resources for stores and loads, and determining all possible scheduler ports. Register renaming is also controlled by the Register Alias Table (RAT), which is used to mark where the data we depend on is coming from (after that value, too, came from an instruction that has previously been renamed).

Sandy Bridge's move to PRF-based renaming has a fairly substantial impact on power, too. With the new instruction set extension, which allows for 256-bit operations, retirement would otherwise have meant a large number of 256-bit values needlessly being moved to the Retirement Register File each time. This is entirely eliminated in Sandy Bridge. Since Sandy Bridge performs speculative execution, it can speculate incorrectly. When this happens, the architectural state is invalidated and as such needs to be rolled back to the last known valid state.
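
A toy model of PRF-based renaming (purely illustrative; the table sizes and function names are invented, not Sandy Bridge's actual structures) shows why both retirement and rollback become cheap: values live in a single physical register file, and only the architectural-to-physical mapping ever changes.

```c
#include <stdint.h>

#define ARCH_REGS 16   /* architectural registers (illustrative)        */
#define PHYS_REGS 160  /* physical register file entries (illustrative) */

static uint64_t prf[PHYS_REGS];     /* the only place data is stored    */
static int rat[ARCH_REGS];          /* speculative arch -> phys mapping */
static int rat_retired[ARCH_REGS];  /* last known-good mapping          */
static int free_list[PHYS_REGS], free_top;

static void rename_init(void) {
    for (int r = 0; r < ARCH_REGS; r++)
        rat[r] = rat_retired[r] = r;       /* identity mapping at reset */
    free_top = 0;
    for (int p = ARCH_REGS; p < PHYS_REGS; p++)
        free_list[free_top++] = p;         /* remaining regs are free   */
}

/* Rename: a new destination gets a fresh physical register. */
static int rename_dest(int arch_reg) {
    int p = free_list[--free_top];
    prf[p] = 0;                   /* producer will write the value here */
    rat[arch_reg] = p;
    return p;
}

/* Retire: commit the mapping; no data is copied anywhere. */
static void retire(int arch_reg) {
    rat_retired[arch_reg] = rat[arch_reg];
}

/* Mis-speculation: roll back by restoring the mapping table alone. */
static void rollback(void) {
    for (int r = 0; r < ARCH_REGS; r++)
        rat[r] = rat_retired[r];
}
```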

Sandy Bridge introduced a number of new optimizations performed prior to entering the out-of-order and renaming part. Two of those optimizations are Zeroing Idioms and Ones Idioms. The first common optimization performed in Sandy Bridge is Zeroing Idiom elimination, a dependency-breaking idiom. A number of common zeroing idioms are recognized and consequently eliminated.
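
For instance (a hand-written illustration, not an excerpt from any manual), these are the kinds of zeroing idioms compilers routinely emit, and which the renamer can recognize:

```c
#include <immintrin.h>

/* x86-specific illustration of common zeroing idioms. */
void zeroing_idioms(void) {
    int x;

    /* Compilers zero a GPR with "xor eax, eax" rather than "mov eax, 0";
     * it encodes shorter, and the renamer treats it as a zeroing idiom
     * that breaks any dependency on the old register value. */
    __asm__ ("xorl %0, %0" : "=r"(x));

    /* The SIMD equivalents are typically emitted as pxor/xorps: */
    __m128i vi = _mm_setzero_si128();  /* pxor  xmm, xmm */
    __m128  vf = _mm_setzero_ps();     /* xorps xmm, xmm */

    (void)x; (void)vi; (void)vf;
}
```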

Eliminated zeroing idioms have zero latency and are entirely removed from the pipeline (i.e., they consume no execution resources). The Ones Idiom is another dependency-breaking idiom that can be optimized: in all the various PCMPEQx instructions that perform packed comparison, comparing the same register with itself always sets all bits to one. Sandy Bridge features a very large unified scheduler that is dynamically shared between the two threads.

The scheduler is exactly one and a half times bigger than the reservation station found in Nehalem (a total of 54 entries). The various internal reordering buffers have been significantly increased as well. Sandy Bridge has two distinct physical register files (PRFs): one for integer values and one for floating-point and vector values. It's worth pointing out that prior to Sandy Bridge, code that relied on constant register reading was bottlenecked by a limitation in the register file, which was limited to three reads.

This restriction has been eliminated in Sandy Bridge. Sandy Bridge, like Nehalem, has six ports. Ports 0, 1, and 5 are used for executing computational operations. The integer stack handles 64-bit general-purpose integer operations. Each cluster operates within its own domain.

Domains help reduce power for less frequently used domains and simplify the routing networks. Data flowing within its own domain (e.g., integer to integer) is cheap, whereas data flowing between two separate domains (e.g., integer to floating-point) incurs additional bypass latency. Sandy Bridge's ports 2, 3, and 4 are used for executing memory-related operations such as loads and stores. Those ports all operate within the integer stack due to integer latency sensitivity. In order to have a noticeable effect on performance, Intel opted not to repeat their initial SSE implementation choices.

Instead, Intel doubled the width of all the associated execution units. This includes full hardware 256-bit floating-point multiply, add, and shuffle, all with single-cycle throughput. Widening the entire pipeline is a fairly complex undertaking.

The challenge with introducing a new 256-bit extension is being able to double the output of one of the stacks while remaining invisible to the others. Intel solved this problem by cleverly dual-purposing the two existing 128-bit stacks during AVX operations to move full 256-bit values. For example, a 256-bit floating-point add operation would use the integer SIMD domain for the lower 128-bit half and the FP domain for the upper 128-bit half to form the entire 256-bit value.
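
From software's point of view the split is invisible: a single 256-bit instruction is issued, and the hardware internally forms it out of the two 128-bit stacks. A minimal example using standard AVX intrinsics:

```c
#include <immintrin.h>

/* One 256-bit AVX add per iteration. On Sandy Bridge each 256-bit
 * operation is internally executed across the two existing 128-bit
 * stacks, but software sees a single instruction. */
void vadd(float *c, const float *a, const float *b, int n) {
    for (int i = 0; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(c + i, _mm256_add_ps(va, vb));
    }
}
```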

The re-use of existing datapaths results in fairly substantial die area and power savings. Overall, Sandy Bridge is further enhanced with rebalanced ports.

For example, the various string operations have been moved to port 0. A second port has been augmented with LEA execution capabilities. Ports 0 and 1 gained additional capabilities such as integer blends. The various security enhancements that were added in Westmere have also been improved. Retirement happens in-order and releases any used resources, such as those used for tracking in the reorder buffer.

One of the most sought-after improvements was increased load bandwidth. With inadequate load bandwidth, computational code (specifically the new AVX code) will effectively starve.

Back in Nehalem, there were three ports for memory, with two of them serving as address generation units. In particular, Port 2 was dedicated to loads and Port 3 was dedicated to stores. Intel treats where you want to store the data and what you want to store as two distinct operations.

Port 3 was used for the store-address calculation, whereas Port 4 was used for the actual store data. In Sandy Bridge, Intel made the two ports symmetric; that is, both Port 2 and Port 3 are AGUs that may be used for loads and stores, effectively doubling the load bandwidth. The L1 data cache is physically tagged, virtually indexed, and uses 64 B lines along with a write-back policy. The lack of support for 1 GiB pages was addressed in Sandy Bridge with four dedicated DTLB entries for 1 GiB pages. In addition to making the two AGUs more flexible, many of the buffers were enlarged.
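
The practical effect (our illustration, assuming the port arrangement described above) is that a loop with two loads and one store per iteration can now issue all three memory operations in the same cycle, instead of the loads being serialized through a single dedicated load port:

```c
#include <stddef.h>

/* Two loads (a[i], b[i]) and one store (c[i]) per iteration. With
 * Ports 2 and 3 both able to generate load or store addresses and
 * Port 4 carrying the store data, all three memory operations can
 * be in flight in the same cycle. */
void triad(long *c, const long *a, const long *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```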

Likewise, the store buffer has been slightly increased from 32 entries to 36. Both buffers are partitioned between the two threads. Altogether, Sandy Bridge can have 100 simultaneous memory operations in flight. As with the L1D, the L2 is organized the same as in Nehalem. It is 256 KiB, 8-way set associative, and non-inclusive of the L1; that is, L1 data may or may not be found in the L2. The L2 uses a write-back policy and has a 12-cycle load-to-use latency, which is slightly worse than the 10 cycles in Nehalem.
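
Load-to-use latencies like these are usually measured with a dependent pointer chase. A rough sketch follows (the buffer size and stride are assumptions chosen to land in the L2 while spilling out of the L1D; a random permutation would defeat the prefetchers more reliably):

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Dependent pointer chase: each load's address depends on the previous
 * load's result, so the time per step approximates the load-to-use
 * latency of whichever cache level the working set fits in. */
int main(void) {
    size_t n = 192 * 1024 / sizeof(void *);
    void **buf = malloc(n * sizeof(void *));
    /* Simple stride pattern; a random permutation defeats prefetch better. */
    for (size_t i = 0; i < n; i++)
        buf[i] = &buf[(i + 16) % n];

    void **p = buf;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < 100000000L; i++)
        p = *p;                     /* serialized chain of loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("%.2f ns per load (p=%p)\n", ns / 1e8, (void *)p);
    free(buf);
    return 0;
}
```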

Configurability was a major design goal for Sandy Bridge and is something that Intel spent considerable effort on. With a highly configurable design, using the same macro cells, Intel can meet the requirements of the different market segments. A copy of the paper can be found here.

The Sandy Bridge floorplan, power planes, and choppability axes are shown in a diagram from Intel. The design is modular, allowing two of the cores to be "chopped off" along with their L3 slices to form a dual-core die. Additionally, the GPU can be optimized for a particular segment by reducing the number of execution units. Sandy Bridge allows half of the execution units (i.e., 6 of the 12) to be chopped off.

It's worth pointing out that the non-chop area of the GPU is greater due to the unslice and fixed-function media portions of the GPU, which are offered on all models regardless of the GPU configuration.

With over a dozen variations possible, Sandy Bridge ships in three of those configurations as actual fabricated dies. It's worth pointing out that each GPU execution unit is roughly 20 million transistors, with 6 of them amounting to around 120 million.

There is also no quad-core configuration with the low-end GPU version (i.e., a quad-core die paired with the 6 EU GPU). Depending on the price segment, additional features may be disabled. For example, low-end value chips such as those under the Celeron brand may have more of the L3 cache disabled. The L3 cache can be fused off by slice with a granularity of 4-way (512 KiB) chunks. For example, a full L3 slice is 2 MiB, corresponding to 16-way set associativity, whereas a 1.5 MiB slice is 12-way. In addition to the cache, both multi-threading and the number of PCIe lanes offered can be disabled or enabled depending on the exact model offered.
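
The arithmetic behind that granularity, assuming the 2 MiB, 16-way slice described above, works out as follows:

```c
#include <stdio.h>

int main(void) {
    const unsigned slice_kib = 2048;             /* full L3 slice: 2 MiB   */
    const unsigned ways      = 16;               /* 16-way set associative */
    const unsigned way_kib   = slice_kib / ways; /* 128 KiB per way        */

    /* Fusing off cache four ways at a time removes 512 KiB per step. */
    for (unsigned w = ways; w >= 8; w -= 4)
        printf("%2u ways -> %4u KiB slice\n", w, w * way_kib);
    return 0;
}
```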

Some macrocells remain the same regardless of the die configuration. Below is the percentage breakdown by component for each of the dies.

Values may be rounded and may not add up to exactly 100%. With the higher level of integration, testing also increases in complexity, because the ability to observe data signals becomes increasingly difficult as those signals are no longer exposed.

Those internal signals are quite valuable because they can provide designers with insight into the flow of threads and data among the cores and caches. GDXC allows chip, system, or software debuggers to sample the traffic on the ring bus, including the ring protocol control signals themselves, and dump the data to an external analyzer via a dedicated on-package probe array. It's worth pointing out that GDXC is inherently vulnerable to a physical port attack if it is made accessible after factory testing.

Sandy Bridge reworked the way clock generation is done. The goal was to ensure uniformity and consistency across all clock domains.

The base clock (BCLK) is now 100 MHz; note that this has changed from the 133 MHz used in previous architectures. The BCLK is the reference edge for all the clock domains. This was done to ensure clock skew is minimized as much as possible over the different power planes. Additionally, a separate reference clock is also generated for the main memory system.

Because of the way clock generation is done in Sandy Bridge, some overclocking capabilities are no longer possible. Overclocking is generally done on unlocked parts such as the Core i5-2500K and Core i7-2600K processors. Some initial bad press surrounded Sandy Bridge's overclocking capabilities because it was revealed that the BCLK can no longer be meaningfully overclocked.

The overclockable models can be overclocked using their clock multipliers; this requires the desktop "P" PCH. Those ratios are not adjustable and scale with the BCLK if it is overclocked.
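
Since every domain hangs off the single reference, the resulting core frequency is simply the multiplier times the 100 MHz BCLK (the multiplier values below are hypothetical examples, not specific SKUs):

```c
#include <stdio.h>

int main(void) {
    const double bclk_mhz = 100.0;  /* Sandy Bridge base clock */
    /* Hypothetical multipliers; unlocked "K" parts let these be raised. */
    const int multipliers[] = { 16, 34, 38, 45 };

    for (int i = 0; i < 4; i++)
        printf("x%-2d -> %.1f GHz\n",
               multipliers[i], bclk_mhz * multipliers[i] / 1000.0);
    return 0;
}
```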

Power consumption has been a key focus area for Sandy Bridge. The two power vectors Sandy Bridge tries to address are active power, which is concerned with performance, and idle power, which is concerned with average power and battery life. The Power Control Unit (PCU) is located in the System Agent; it incorporates the various power-management hardware logic as well as a dedicated microcontroller running firmware that controls the various power features of the device. Communication with the physical cores and the graphics is done via a dedicated power-management link over the ring.

The unit constantly reads the physical parameters of the various parts of the chip in real time, allowing it to optimize the power efficiency of the die. The power unit is exposed to the outside world via a set of external outputs, which allow it to interact with the rest of the system to control the voltage regulator and an external power-management controller. Sandy Bridge has two variable power planes and a single fixed power plane for the System Agent.

The first one covers the ring, the cache, and the physical cores. Note that this is a single power plane shared by all those components, which means they all move up or down in frequency and voltage together. Each of the individual cores is capable of being entirely power-gated when needed, such as when the core goes into a deeper C-state.

When this happens, the core state is saved into one of the ways of the cache and the core is entirely shut off. As with the cores, the caches can also be power-gated per way. With each deeper idle state, additional ways are invalidated, flushed, and turned off. The integrated graphics has its own variable power plane, which can run at an entirely different voltage and frequency than the cores. The graphics are not power-gated, but the voltage is cut off when the graphics need to go into a sleep state.

The System Agent incorporates a programmable power plane with a set of predefined voltages that the hardware signals can select from. Optimizing for performance means trying to deliver as much power as possible to demanding components, all while meeting stringent constraints.

Power algorithms take various constraints into account when considering which P-state (i.e., voltage/frequency operating point) to select. Improvements in that area come from throughput and responsiveness improvements branded under "Turbo Boost 2.0". In order to optimize active power, you need to be able to determine the real-time power draw.
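
A toy version of such a constraint-driven P-state choice (entirely illustrative; real PCU firmware also weighs temperature, current limits, and time-windowed turbo budgets, and every number below is invented):

```c
#include <stdio.h>

struct pstate { double ghz, watts; };

/* Pick the fastest P-state whose estimated power fits the budget. */
static int pick_pstate(const struct pstate *tbl, int n, double budget_w) {
    for (int i = 0; i < n; i++)        /* table sorted fastest-first */
        if (tbl[i].watts <= budget_w)
            return i;
    return n - 1;                      /* fall back to the slowest   */
}

int main(void) {
    const struct pstate tbl[] = {
        { 3.8, 95.0 }, { 3.4, 70.0 }, { 2.8, 45.0 }, { 1.6, 18.0 },
    };
    int i = pick_pstate(tbl, 4, 60.0);
    printf("chosen: %.1f GHz at %.0f W\n", tbl[i].ghz, tbl[i].watts);
    return 0;
}
```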


