Many-core Mania

DAC-44 notes

Friday Jun 15, 2007

I was at DAC (the Design Automation Conference), the biggest trade conference in the EDA industry, held in San Diego last week. I presented a paper from my PhD research (shameless plug for my dissertation summary). Overall it was a very exciting conference. Here are some of the notes I made in the sessions I sat in on.

Tuesday: 06/05


  1. Panel Session: Mega Trends and EDA 2017

    1. Moore's law will continue for the next 10 yrs; the
      corollary is a similar law for manycore processors
      (might see 500+ cores).
    2. EDA/semicon industry will grow at 8% (single digit)
      for the next 10 yrs; not the era of 14-15% growth of
      the '90s.
    3. Industry in "golden age" after initial stage of
      irruption, tech bubble, etc.
    4. More interaction/convergence b/w foundries and EDA tools
      companies
    5. Power, DFM, DFY major issues, but not show-stoppers;
      consensus that solutions will be found.
    6. Application driver may not be consumer electronics but
      healthcare, auto, etc. (Juan-Antonio Carballo, Argon
      Venture Partners)
    7. Convergence of design platforms; consolidation across
      semicon and EDA companies (Carballo)
    8. Will see emergence of more application software
      development (perf. optimizers, tuners, etc) by EDA
      companies (Prof. Kurt Keutzer, Berkeley)
    9. Techonomics will drive the respective industries (Aart
      de Geus, Synopsys)

  2. Paper session: Leakage Power Analysis and Optimization

    1. Paper #3: modeled dependence of Vt on width, useful in
      the context of wider transistors.

  3. Panel session: Early power-aware design and validation, myth or
    reality

    1. Consensus that it is largely a reality as several
      designs use it successfully
    2. No consensus on what accuracy can/will be achieved
    3. Data quality is bad for arch.-level power, but early
      power-aware design does help and works in real designs.
      Start with the following:
        i. Arch. performance model
        ii. Block-level activity from simulator
        iii. Switching cap. extrapolated from the
          previous-generation design
    4. Silicon came within 10% of arch. estimates, obtained
      with an intensive high-level modeling effort (Steve
      Curtis, Intel)
    5. High-level power estimation/opt tools confined to
      startups; big EDA companies not betting on it since not
      sure when system-level design will take off (E. Macii,
      Univ. Torino)
    6. Lack of standard format for exchanging power info;
      Accellera UPF expected to make things better
    7. Limited availability of info. for characterizing power
      models because knowledge of the IP is needed (companies
      not ready to share)
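
A minimal sketch of the three-input starting point from the early power-aware design panel above, assuming the standard dynamic-power relation P = activity x C x Vdd^2 x f. The block names, capacitances, activity factors, and the cap_scale shrink factor are made-up illustrative values, not figures from the panel; in practice activity would come from the simulator and frequency from the arch. performance model.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    activity: float        # avg. switching activity per cycle (from simulator)
    prev_gen_cap_f: float  # switched capacitance of previous-gen block, in farads


def estimate_dynamic_power(blocks, vdd, freq_hz, cap_scale):
    """Return (per-block watts, total watts) using P = a * C * Vdd^2 * f.

    cap_scale is a hypothetical factor that extrapolates the previous
    generation's switched capacitance to the new design.
    """
    per_block = {
        b.name: b.activity * (b.prev_gen_cap_f * cap_scale) * vdd ** 2 * freq_hz
        for b in blocks
    }
    return per_block, sum(per_block.values())


# Illustrative (invented) inputs: two blocks at 1.0 V and 1 GHz.
example_blocks = [
    Block("alu", activity=0.2, prev_gen_cap_f=1e-9),
    Block("cache", activity=0.1, prev_gen_cap_f=2e-9),
]
per_block_w, total_w = estimate_dynamic_power(
    example_blocks, vdd=1.0, freq_hz=1e9, cap_scale=0.5
)
```

This covers dynamic power only; leakage would need a separate term.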


Wednesday: 06/06

  1. Paper session: Process-aware physical design

    1. Paper #1: proposes buffer insertion under process
      variations. First, do slew-constrained buffering, then
      iteratively check timing and do delay-constrained
      buffering until timing is satisfied.
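
The two-phase flow from the buffer insertion paper can be sketched roughly as below. This is only a toy illustration of the control flow on a single uniform wire with equally spaced buffers: the Elmore-style delay model, every parameter value, and the stopping rule are assumptions made here, and process variation is not modeled at all, so this is not the paper's algorithm.

```python
import math


def wire_delay(length, n_buffers, r=0.1, c=0.2, r_buf=0.5, t_buf=0.05):
    """Toy Elmore-style delay (arbitrary units) of a wire cut into equal segments."""
    segments = n_buffers + 1
    seg = length / segments
    per_segment = 0.5 * r * c * seg ** 2 + r_buf * c * seg
    return segments * per_segment + n_buffers * t_buf


def slew_constrained_buffers(length, max_seg):
    """Phase 1: fewest buffers such that no segment exceeds max_seg.

    max_seg stands in for a slew limit (hypothetical simplification).
    """
    return max(0, math.ceil(length / max_seg) - 1)


def buffer_wire(length, max_seg, target_delay, max_iters=100):
    """Phase 2: check timing, add one buffer at a time until the target is met.

    Returns (n_buffers, delay, timing_met). Stops early if one more buffer
    would no longer reduce the delay.
    """
    n = slew_constrained_buffers(length, max_seg)
    for _ in range(max_iters):
        d = wire_delay(length, n)
        if d <= target_delay:
            return n, d, True
        if wire_delay(length, n + 1) >= d:
            return n, d, False
        n += 1
    return n, wire_delay(length, n), False
```

For example, `buffer_wire(10.0, 6.0, 1.45)` starts from the one buffer the slew phase requires and adds one more to meet the delay target, while an unreachable target such as 1.0 returns with `timing_met` set to False.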


Thursday: 06/07

  1. Special Session: Thousand core chips

    1. Shekhar Borkar, Intel:

      1. 45nm can integrate 8 bil transistors; by 2014 we
        will hit terascale (100 bil); integration
        capability will facilitate 1K cores in 10 yrs
      2. Transistors moving to tri-gate; delay and energy
        scaling will slow down; 1.25x realistic freq.
        scaling; 0.7x Vdd scaling; 1000-core chips will
        have 1000W of power
      3. Pollack's rule: in a single core, 2x (power or
        area) buys only ~1.4x perf., i.e. perf. grows
        roughly as the square root of area.
      4. Multi-core chips will have general-purpose and
        special-purpose procs connected by an
        interconnect fabric; opportunity for
        fine-grained pwr. mgmt. in multicore; parallel
        s/w key to multicore success.
      5. Having many small cores will give better
        performance but parallelism in applications
        should double every generation to break even;
        means that s/w shd be able to parallelize better
      6. How to feed the beast? 100GBps BW may be needed.
        I/O power at this rate will be 25W using ~400
        differential pins. Reduce distance between cores
        to abt. 1-2mm; 3d chips with DRAM on bottom
        probably best solution.
      7. Resilient uarch in manycore needed because
        components will get more and more unreliable
        (soft errors, variability, aging, etc.);
        possible solutions: dynamic on-chip testing,
        redundancy, etc.

    2. Anant Agarwal, MIT:

      1. KILL rule for multicore (Kill If Less than
        Linear): a resource in a core (power or area)
        should be increased only if the core's perf.
        improvement is at least proportional to the
        increase in that resource.

    3. Wen-Mei Hwu, UIUC:

      1. Parallel prog models needed for >4 cores. Models
        should last for many generations of multicore.

      2. Implicitly parallel prog implies APIs managing
        parallelism. Explicitly parallel prog implies
        programmers managing parallelism. In former,
        algorithm should be parallel but can be written
        in seq. language. Latter is rigid (need to
        change as # of cores multiply) and not
        advisable.

    4. John Darringer, IBM (EDA for Multi-core):

      1. Growing use of diverse modular design
      2. EDA challenges: custom design efficiency,
        ASIC-style productivity, design and verification
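
A few back-of-envelope checks on the numbers from the thousand-core session, as a rough sketch. It assumes Pollack's rule is read as perf. ~ sqrt(area), that the workload is perfectly parallel when comparing one big core with many small ones, that the KILL rule compares fractional gains (with exactly linear counted as "keep"), and that 100GBps means 100 gigabytes per second. The function names are hypothetical.

```python
import math


def pollack_perf(area):
    """Single-core perf. under Pollack's rule: perf grows as sqrt(area)."""
    return math.sqrt(area)


def aggregate_throughput(total_area, n_cores):
    """Throughput of n equal cores sharing total_area.

    Assumes a perfectly parallel workload (an idealization).
    """
    return n_cores * pollack_perf(total_area / n_cores)


def passes_kill_rule(perf_gain, resource_increase):
    """KILL rule: keep a resource increase only if perf. scales at least linearly.

    Both arguments are fractions, e.g. 1.0 means +100%.
    Treating the exactly-linear case as "keep" is an assumption.
    """
    return perf_gain >= resource_increase


def io_energy_per_bit(power_w, bandwidth_bytes_per_s):
    """Joules spent per bit moved at the given I/O power and bandwidth."""
    return power_w / (bandwidth_bytes_per_s * 8)


# Doubling one core's area: fractional perf. gain under Pollack's rule.
gain_from_2x_area = pollack_perf(2.0) / pollack_perf(1.0) - 1.0
```

Under these assumptions, 16 unit-area cores give 4x the throughput of one 16-unit core, doubling a core's area buys about 41% perf. and so fails the KILL rule, and 25W at 100GBps works out to about 31 pJ per bit.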

