Hardware Upgrade Forum - View Single Post

tuttodigitale · 23-12-2015, 16:16

OFFICIAL SLIDES
13.12.16 RYZEN, 3.4GHz+ base, 25MHz boost step
23.08.16 HOT CHIPS 2016
01.06.16 COMPUTEX 2016: ZEN
20.05.16 AMD Investor Presentation
07.01.16 GlobalFoundries 14nm FINFET
06.05.15 Financial Analyst Day

USEFUL LINKS
https://twitter.com/dresdenboy
http://dresdenboy.blogspot.it

RUMORS

17.11.16 [TECHPOWERUP] Summit Ridge CPU versions and prices, starting at 220 dollars

Quote:

Originally posted by techpowerup

Recent reports peg AMD's upcoming line of microprocessors based on the Zen micro-architecture as being labelled SR3, SR5 and SR7 for different hardware tiers (with the SR3 being the lowest-performing, and SR7 being, naturally, the highest-performing). A recent post on Chiphell claims that a leaked slide from an AMD presentation gives us these insights, with further information on pricing: the roadmap shows that all Zen SR (Summit Ridge) processors will sell for more than RMB 1500 ($220).

28.10.16 [Bitsandchips] 2 versions of Raven Ridge planned

Quote:

Originally posted by bitsandchips

However, according to our sources, at the present moment there are two versions of Raven Ridge under development, one with a 12CUs GPU and one with a 16CUs GPU. You can see the main differences in the table below.

09.08.16 [Planet3dnow] First benchmarks of the ZEN ES at 2.8/3.2GHz

Quote:

Originally posted by capitan_crasy

scaling

06.08.16 [bitsandchips] DDR4 IMC from RAMBUS

Quote:

Originally posted by bitsandchips

Zen's DDR4 IMC might come from Rambus Technology

20.07.16 Clock speeds for ZEN with 4, 8, 24, 32 cores

Quote:

Originally posted by guru3d

The engineering samples currently are set at revision A0. The user who spread the details talks about four chips with 4, 8, 24 and 32 cores. The first two SKUs would be for the AM4 socket, while the last two are intended for servers.

The two AM4 chips are a quad-core and an octa-core with 8 and 16 threads. The quad-core would get 2 MB L2 cache and 8 MB L3 cache, while the octa-core would get double that amount. Both engineering samples currently run at a clock speed of 2.8 GHz, with a maximum boost up to 3.2 GHz. The TDP of the two would be 65 watts for the chip with four cores and 95 watts for the octa-core. At idle the clock speed can throttle down to 550 MHz, with an amazing idle power consumption of 2.5 and 5 watts.

For servers there is the SP3 platform. The leaker has details on a 24-core and a 32-core chip. The boost clock speed is 2.75 GHz for the 24-core and 2.9 GHz for the 32-core. The idle clock rate is even lower here, at 400 MHz. The TDP of the two is 150 and 180 watts respectively.
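To put the rumored TDP figures side by side, here is a minimal sketch that divides each quoted TDP by its core count. The SKU labels are made up for illustration, and TDP covers the whole package (uncore, memory controller, I/O), so the result is only a rough comparison, not a per-core power measurement.

```python
# (cores, TDP in watts) as quoted above for the A0 engineering samples.
# The dictionary keys are illustrative labels, not official product names.
SAMPLES = {
    "AM4 quad-core": (4, 65),
    "AM4 octa-core": (8, 95),
    "SP3 24-core": (24, 150),
    "SP3 32-core": (32, 180),
}

def watts_per_core(cores, tdp_watts):
    """Package TDP divided evenly across cores (a deliberately crude metric)."""
    return tdp_watts / cores

per_core = {name: watts_per_core(c, w) for name, (c, w) in SAMPLES.items()}

for name, value in per_core.items():
    print(f"{name}: {value:.2f} W per core")
```

If the rumor holds, the TDP budget per core shrinks steadily as the core count grows, which fits the lower boost clocks quoted for the server parts.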

02.06.16 [techpowerup] No FCH for AM4

Quote:

Originally posted by techpowerup

The AM4 socket sees AMD completely relocate the core-logic (chipset) to the processor's die. Socket AM4 motherboards won't have any chipset on them.

25.03.16 [bitsandchips] new mounting-hole spacing

Quote:

Originally posted by bitsandchips

If the information we have received turns out to be true (we trust our source, but to err is human!), coolers that do not belong to the category of clip-mounted coolers (such as the new AMD Wraith) will be incompatible with the new Socket AM4.

22.03.16 [bitsandchips] 1331 pins for AM4

Quote:

Originally posted by bitsandchips

It has been confirmed to us that AMD's upcoming Socket AM4 will be of the µOPGA type, not LGA (a solution reserved for the Opteron versions of Zen), and that it will have no fewer than 1331 pins.

29.02.16 [Dresdenboy] New details from Dresdenboy

Quote:

Originally posted by dresdenboy

The interconnect subsystem is called "Data Fabric", which knows so-called coherent slaves according to the last enumeration list.
There is a new L0 ITLB, which is the only level 0 thing being mentioned so far, while VR World mentioned level 0 caches (besides other somewhat strange rumoured facts like no L3 cache in the APU variant - while this has been shown on the leaked Fudzilla slide). The only thing resembling such a L0 cache is a uOp cache, which has clearly been named in the new patch in a section related to the decode/dispatch block (indicated by "de"):

There are strings for both a "uop cache" and a "uop buffer". So far I knew about this uop buffer patent filed by AMD in 2012, which describes different related techniques aimed at saving power, e.g. when executing loops or to keep the buffer physically small by leaving immediate and displacement data of decoded instructions in an instruction byte buffer ("Insn buffer") sitting between instruction fetch and decode. The "uop cache" clearly seems to be a separate unit. Even without knowing how many uops per cycle can be provided by that cache, it will help to save power and remove an occasional fetch/decode bottleneck when running two threads. The next interesting block is about the execution units:

Here is a first confirmation of a checkpoint mechanism. This has been described in several patents and might also be an enabler for hardware transactional memory, which has been proposed in the form of ASF back in 2009. Another use case is the quick recovery from branch mispredictions, where program flow can be redirected to a checkpoint created right before evaluating a difficult to predict branch condition.

There is a confirmation of the "GMI link".

Notable changes are:
uOp Cache has been added based on the new patch
FMUL/FADD for FMAC pairing removed, based on some corrections of the znver1 pipeline description.
4x parallel Page Table Walkers added, based on US20150121046
128b FP datapaths (also to/from the L1 D$) based on "direct" decode for 128b wide SIMD and "double" decode for 256b AVX/AVX2 instructions
32kB L1 I$ has been mentioned in some patents. With enough ways, a fast L2$ and a uOp cache this should be enough, I think.
issue port descriptions and more data paths added
2R1W and 4 cycle load-to-use-latency added for the L1 D$ based on info found on a LinkedIn profile and the given cycle differences in the znver1 pipeline description
Stack Cache speculatively added based on patents and some interesting papers. This doesn't help so much with performance, but a lot with power efficiency
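The "direct" versus "double" decode point in the list above can be sketched as a simple width calculation. This is only an illustration under the assumption that an instruction is split into one operation per 128-bit slice of the datapath; the function name and the "other" fallback are hypothetical and not taken from the patch.

```python
DATAPATH_BITS = 128  # FP datapath width suggested in the list above

def decode_type(simd_bits, datapath_bits=DATAPATH_BITS):
    """Return (decode class, number of ops) for a SIMD instruction width.

    Assumption: one op per datapath-wide slice, so 128b needs one op
    ("direct") and 256b needs two ("double").
    """
    ops = -(-simd_bits // datapath_bits)  # ceiling division
    kind = {1: "direct", 2: "double"}.get(ops, "other")
    return kind, ops

for width in (128, 256):
    kind, ops = decode_type(width)
    print(f"{width}b SIMD -> {kind} decode, {ops} op(s)")
```

Under this assumption a 256b AVX/AVX2 instruction costs twice the issue slots of a 128b one, which is why the decode classes hint at the datapath width.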

12.02.16 [CERN] 32 cores, 6-wide & L0 cache

Quote:

Originally posted by CERN

ZEN High End ‘Exascale’ CPU, 1-4 Socket (1P-4P) – specs as per CERN

Multi-Chip Module (2×16-core)
32 ZEN x86 Core, 6-wide
128 KB L0 Cache (4KB per core)
2 MB L1 D-Cache (64KB per core)
2 MB L1 I-Cache (64 KB per core)
16 MB L2 Cache (512 KB per core)
64 MB L3 Cache (8MB cluster per quad unit)
576-bit Memory Controller (two times 4×72-bit, 64-bit + 8-bit ECC)
204.8 GB/s via DDR4-3200 (ECC Off, 102.4 GB/s per die)
170.6 GB/s via DDR4-2666 (ECC On, 85.3 GB/s per die)
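The aggregate figures in this list follow from the per-core and per-channel values, which a short calculation can confirm. This is a sketch under stated assumptions: eight memory channels in total (two dies with four each), peak bandwidth counted on the 64 data bits only, and DDR4-2666 taken at its nominal 2666.67 MT/s.

```python
CORES = 32
CORES_PER_CLUSTER = 4          # one "quad unit" shares an L3 slice
CHANNELS = 8                   # assumed: two dies x four channels
DATA_BITS, ECC_BITS = 64, 8    # per channel

def total_kb(per_core_kb, cores=CORES):
    """Aggregate capacity of a per-core cache in KB."""
    return per_core_kb * cores

def ddr4_bandwidth_gbs(mt_per_s, channels=CHANNELS, data_bits=DATA_BITS):
    """Peak bandwidth in GB/s: transfers/s x bytes per transfer x channels."""
    return mt_per_s * 1e6 * (data_bits // 8) * channels / 1e9

l0_kb = total_kb(4)                              # L0 cache
l1d_mb = total_kb(64) // 1024                    # L1 D-cache
l2_mb = total_kb(512) // 1024                    # L2 cache
l3_mb = (CORES // CORES_PER_CLUSTER) * 8         # 8 MB per quad unit
bus_bits = CHANNELS * (DATA_BITS + ECC_BITS)     # memory controller width
bw_3200 = ddr4_bandwidth_gbs(3200)
bw_2666 = ddr4_bandwidth_gbs(8000 / 3)           # nominal 2666.67 MT/s

print(l0_kb, l1d_mb, l2_mb, l3_mb, bus_bits)
print(f"{bw_3200:.1f} GB/s, {bw_2666:.1f} GB/s")
```

The results match the list (128 KB, 2 MB, 16 MB, 64 MB, 576 bits, 204.8 GB/s); the DDR4-2666 figure comes out as roughly 170.67 GB/s, which the list shows truncated to 170.6.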

16.01.16 [AMD - Raja Koduri] AMD Ultra Wide-Band

Quote:

Originally posted by wccftech.com

PCI-Express is already seen as a bottleneck when connecting several nodes in high-performance sectors. AMD sees their current PCI-e and CrossFire solutions not working with next generation machines, hence they have to design a new coherent fabric. The interconnect will offer speeds of 100 GB/s across multiple GPUs and APUs that are featured inside AMD powered compute machines and will deploy some open standards. Asked whether the interconnect will also maintain memory coherency and sharing between the GPUs and CPUs, Raja stated that he can’t reveal that right now but will definitely have a detailed showcase of their coherent fabric later on, as coherency between their several chip designs is being kept in mind.

Quote:

Originally posted by fudzilla slide

03.11.15 [Dresdenboy] 10 Pipelines per core

Quote:

Originally posted by Dresdenboy

As heard earlier this year, Zen will use SMT and an improved cache subsystem while being designed from scratch with new ideas combined with reusing existing components (to reduce the effort). This might even include already existing and somewhat developed ideas not realized in previous designs. A lot of the new functionality has been filed for patenting. For example there was a mention of checkpointing, which is good for quick reversion of mispredicted branches and other reasons for restarting the pipelines. Some patents suggest that Zen might use some slightly modified Excavator branch prediction.

Here are some quotes of the patch file:

+;; Decoders unit has 4 decoders and all of them can decode fast path
+;; and vector type instructions.
+;; Integer unit 4 ALU pipes.
+;; 2 AGU pipes.
+;; Floating point unit 4 FP pipes.
+ 32, /* size of l1 cache. */
+ 512, /* size of l2 cache. */

Excerpt:
4 wide decoders
4 integer ALUs
2 AGUs (for 2R 1W L1 cache according to a LinkedIn profile)
4 FP pipelines
That makes ten pipelines with a general four wide design.
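The pipeline count can be reproduced straight from the quoted patch comments. The following sketch copies the comment lines from above into a string and sums every "<n> <KIND> pipes" phrase; the helper name and the regular expression are illustrative choices, not part of the patch.

```python
import re

# Comment lines as quoted from the patch file above.
PATCH_EXCERPT = """\
+;; Decoders unit has 4 decoders and all of them can decode fast path
+;; and vector type instructions.
+;; Integer unit 4 ALU pipes.
+;; 2 AGU pipes.
+;; Floating point unit 4 FP pipes.
"""

def count_pipes(text):
    """Return {kind: count} for every '<n> <KIND> pipes' phrase in the text."""
    return {kind: int(n) for n, kind in re.findall(r"(\d+) (\w+) pipes", text)}

pipes = count_pipes(PATCH_EXCERPT)
total = sum(pipes.values())
print(pipes, total)
```

The decoders are not counted as pipes here, so the total covers only the 4 ALU, 2 AGU and 4 FP pipelines.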

20.04.15 [Fudzilla] Opteron 32 core ZEN

Quote:

Originally posted by Fudzilla

Just like the 16 Zen core high performance market APU, each core has 512KB of L2 cache and four cores share 8MB of L3 cache. The highest end part will come with eight clusters of 4 cores, and if you do the math this server oriented CPU will come with 64MB of L3 cache and 16MB of L2 cache for its CPU cores.
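Doing the math from the per-core and per-cluster figures in this paragraph gives 16MB of L2 and 64MB of L3 in total. A minimal sketch, using only the numbers quoted here (the constant names are illustrative):

```python
CLUSTERS = 8             # eight clusters in the highest end part
CORES_PER_CLUSTER = 4    # four cores share one L3 slice
L2_PER_CORE_KB = 512
L3_PER_CLUSTER_MB = 8

cores = CLUSTERS * CORES_PER_CLUSTER
l2_total_mb = cores * L2_PER_CORE_KB // 1024
l3_total_mb = CLUSTERS * L3_PER_CLUSTER_MB

print(f"{cores} cores, {l2_total_mb}MB L2, {l3_total_mb}MB L3")
```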

A few other notable features for the next generation server parts include a new platform security processor that enables secure boot and a crypto coprocessor. The next generation Opteron has eight DDR4 memory channels capable of handling 256GB per channel. The chipset supports PCIe Gen 3, SATA, 4x10GbE Ethernet and a Server Controller Hub. Of course, there will be an SMP, dual socket version.

The next generation Opteron will have 32 CPU cores in its highest end iteration, and we expect some Stock Keeping Units (SKUs) with fewer cores than that for inexpensive solutions.

10.04.15 [Fudzilla] HPC APU with 16 ZEN cores

Quote:

Originally posted by Fudzilla

The new APU platform has everything AMD fans could wish for: four channel DDR4 support, PCIe3, up to 16 Zen cores and a Greenland GPU, paired with High Bandwidth Memory (HBM). The ability to add up to 16 Zen CPU cores suggests that AMD plans to use this chip for the compute market too, as the marriage of 16 Zen cores and HBM powered Greenland graphics can probably score some amazing compute performance numbers.