[1] Reiner W. Hartenstein, Jürgen Becker, et al.:
A Reconfigurable Machine for Applications in Image and Video Compression;
Conference on Compression Technologies and Standards for Image and Video
Compression, Amsterdam, The Netherlands, March 1995
This paper presents a reconfigurable machine for applications in image
or video compression. The machine can be used stand-alone or as a universal
accelerator co-processor for desktop computers for image processing. It
is well suited for image compression algorithms such as JPEG for still
pictures or for encoding MPEG movies. It provides a much cheaper and more
flexible hardware platform than special image compression ASICs and it
can substantially accelerate desktop computing. ----
[2] Reiner W. Hartenstein, Karin Schmidt: Parallelizing
Compilation for a Novel Data-Parallel Architecture; J. P. Gray, F. Naghdy
(Eds.), PCAT-94, Parallel Computing: Technology and Practice, Wollongong,
Australia, pp. 126-137, Nov. 1994
The paper presents a new architectural class of high performance data-parallel
machines, called Xputers. Xputers combine structural programming with traditional
von Neumann control flow programming. From this combination a new programming
paradigm arises which is not familiar to the usual software developer.
To counteract this lack of familiarity, a program partitioning, restructuring, and mapping
method for Xputers has been developed for the input language C. Sources
are restructured and partitioned into an Xputer-suitable execution sequence
providing parallelism at expression and at statement level. Data is mapped
in a regular form onto the Xputer memory space to be accessible by the
Xputer's data sequencer hardware which provides a generic set of fast address
sequences. The data operations within each part of the derived execution
sequence are coded as a structural description for further synthesis towards
the reconfigurable ALU which is based on field-programmable logic. Additionally,
assembly code is produced in order to control program execution through
the data sequencer hardware. The entire method performing the paradigm
shift works without further user interaction and all steps are driven by
parameters describing the actual target hardware configuration.
----
[3] Karin Schmidt: A Program Restructuring and Mapping
Method for Xputers; Dissertation, University of Kaiserslautern, Dec. 1994.
Xputers are a new architectural class of data-parallel machines, well
suited for algorithms with regular or semi-regular data dependencies. They
combine structural programming with traditional von Neumann control flow
programming. From this combination a new programming paradigm arises which
is not familiar to the usual software developer, which diminishes
acceptance. To counteract this, a program restructuring and mapping
method for Xputers has been developed, and is presented in this thesis.
To accommodate the majority of programmers, C has been chosen as input language.
Sources are restructured using techniques known from supercompilers, followed
by partitioning the program source into an Xputer-suitable execution sequence
providing parallelism at expression and at statement level. As in all other
phases, these steps are driven by parameters describing the actual target
hardware configuration, which makes the method adaptable
to the whole family of Xputers. Data is mapped in a regular fashion onto
the Xputer memory space to be accessible by the Xputer's data sequencer
hardware, which provides a generic set of fast address sequences. The data operations
within each part of the derived execution sequence are coded as a structural
description for further synthesis towards the reconfigurable ALU. Additionally,
assembly code is produced in order to control program execution through
the data sequencer hardware. The entire method performing the paradigm
shift works without further user interaction. ----------
[4] A. Ast, J. Becker, R. W. Hartenstein, et al.:
Data-procedural Languages for FPL-based Machines; 4th Int. Workshop On
Field Programmable Logic And Applications, FPL'94, Prague, September 7-10,
1994, Lecture Notes in Computer Science, Springer, 1994.
This paper introduces a new high level programming language for a novel
class of computational devices, namely data-procedural machines. These machines
are up to several orders of magnitude more efficient than computers based on the von Neumann
paradigm, yet are as flexible and as universal as computers.
Their efficiency and flexibility are achieved by using field-programmable
logic as the essential technology platform. The paper briefly summarizes
and illustrates the essential new features of this language by means of
two example programs. ----
[5] R. W. Hartenstein: Hardware / Software Codesign;
Internal Report No. 246/94, University of Kaiserslautern, 1994.
The paper gives some highlights on a new R&D area called Hardware/Software
Co-Design. It tries to give an answer to several questions. What are the
goals and unsolved problems? What are the hardware platforms? What are
the relations between field-programmable hardware and this new area?
----
[6] R. Hartenstein, et al.: A Reconfigurable Data-Driven
ALU for Xputers; IEEE Workshop on FPGAs for Custom Computing Machines,
FCCM'94, Napa, CA., April 1994.
A reconfigurable data-driven datapath architecture for ALUs is presented
which may be used for custom computing machines (CCMs), Xputers (a class
of CCMs) and other adaptable computer systems as well as for rapid prototyping
of high speed datapaths. Fine grained parallelism is achieved by using
simple reconfigurable processing elements which are called datapath units
(DPUs). The word-oriented datapath simplifies the mapping of applications
onto the architecture. Pipelining is supported by the architecture. The
programming environment allows automatic mapping of the operators from
high level descriptions. Two implementations, one with FPGAs and one with
standard cells, are shown. ----
[7] R. W. Hartenstein, H. Reinig, M. Weber: Design
of an Address Generator; Proceedings 3rd Eurochip Workshop on VLSI Design
Training, Grenoble, September 1992.
This paper describes our experiences with the hardware description language
Verilog during the development of the Xputer prototype. First, it introduces
the novel non-von Neumann architecture of the Xputer, its need for efficient
address generation, and the basic structure of the Generic Address Generator.
After a short introduction to Verilog, we discuss the problems with this
hardware description language and show how to work around them using some design
restrictions. At the end, an outlook on testing and simulation possibilities
is given. ----
[8] Alexander G. Hirschbiel: A Novel Processor Architecture
Based on Auto Data Sequencing and Low Level Parallelism; Dissertation,
University of Kaiserslautern, 1991.
The term Xputer stands for a new computational paradigm opening up a
wide design space for many Xputer architectures. One possible implementation,
the Map-oriented Machine (MoM-2), is used as an example to explain the
basic operational principles of Xputers. These are not based on the von
Neumann principles which dominate in contemporary computer systems. A discussion
of these is given showing their throughput problems. In addition, several
speed-up techniques being used in modern von Neumann architectures and
beyond these are briefly discussed. Many Xputer architectures are feasible
and some architectural design alternatives are presented. The performance
derives from the inherent fine grained parallelism and the auto data sequencing
mechanism. The fine grained parallelism results from the reconfigurable
ALU (r-ALU) where programmable compound operators are configured during
compile-time. The auto data sequencing mechanism provides systematic and
optimized methods for data accesses. In Xputers, the data sequencing plays
a central role. On the one hand, the universal address generator enables
the generation features of other systems. On the other hand, the predominant
combinatorial programming requires a new kind of sequential mechanism which
is realized by the address generator. Therefore, it must provide a rich
and balanced repertoire of scan features. Instead of control-flow, data-flow
is the primary activator in Xputers. Application specific compound operators
are configured within the flexible wide-bandwidth r-ALU at very low level
yielding parallelism at very fine granularity, or ultra micro parallelism.
It is based on a new generation of programmable hardware: the field programmable
media (FPM). The r-ALU replaces the hardwired ALU used in most computer
systems. A new kind of register file organization, the data scan cache,
serves as a window to the data memory. It supports fine granularity data
scheduling strategies which minimize processor/memory traffic. Due to the
compiler-friendly hardware features of Xputers, new effective optimization
methods in compilers are possible, which cannot be applied to other computer
systems. ------------------
[9] R. W. Hartenstein, M. Riedmüller, K. Schmidt,
M. Weber: A Novel ASIC Design Approach Based on a New Machine Paradigm;
IEEE Journal of Solid-State Circuits, Vol. 26, No. 7, July 1991; also in:
Internal Report No. 212/91, University of Kaiserslautern, 1991.
This paper introduces a new design methodology for rapid implementation
of cheap high performance ASICs. The method described here derives from
high level algorithm specifications or from high level source programs
not only the target hardware, but - in contrast to silicon compilers -
at the same time also the machine code to run it. The new method is based
on a novel sequential machine paradigm where execution is used (being by
orders of magnitude more efficient) instead of simulation and where programmers
may do the design job, rather than real hardware designers. The paper illustrates
that for a very large class of commercially important algorithms (DSP,
graphics, image processing and many others) this paradigm is by orders
of magnitude more efficient than the von Neumann paradigm. Compared to
von-Neumann-based implementations acceleration factors of up to more than
2000 have been obtained experimentally. The performance of ASICs obtained
by this new methodology is mostly competitive with ASIC designs obtained
in the much slower and much more expensive "traditional" way. As a by-product,
the new methodology also supports the automatic generation of custom computing
machines as accelerators for co-processor use in workstations etc.,
e.g. to accelerate EDA tools. The goal of this paper is to explain
the highly efficient application of the xputer paradigm, rather than to
introduce its hardware implementation, and to
illustrate the innovative power of this paradigm and its potential for
a major step of progress toward systematically deriving ASIC designs from
algorithm specifications. ----
[10] Michael Weber: An Application Development Method
for Xputers; Dissertation, University of Kaiserslautern, 1990.
An application development environment for xputers is introduced in
this thesis. Xputers are based on a new machine paradigm. Their performance
derives from the inherent ultra micro grained parallelism and the auto
data sequencing mechanism. The ultra micro parallelism results from the
reconfigurable ALU (r-ALU) where programmable compound operators are configured
during compile-time. The auto data sequencing mechanism provides systematic
and optimized methods for data accesses. The Xputer's data sequencing mechanism
is the primary activator for all actions of the system. It reduces control
flow to sparse residual control. The proposed application development environment
comprises a special xputer language and a compilation method for ordinary
programs. The compilation technique is completely based on and driven by
data dependence analysis by adapting the theory of systolic array generation.
But here the extracted parallelism is not laid down into a fixed hardware
structure which cannot be changed any more. Since xputers offer a flexible
reprogrammable hardware platform, their compiler-defined structure can be
changed arbitrarily. Thus, for the first time, the parallelization strategies
of systolizing compilation can be freed from the restrictive control-driven
von Neumann principles. Most of the von Neumann bottlenecks are avoided,
resulting in performance figures which can be compared even with ASIC solutions,
even though xputers are uni-processors. --------------
[11] A. Ast, R. W. Hartenstein, A. G. Hirschbiel,
M. Riedmüller, K. Schmidt, M. Weber: Using Xputers as Inexpensive
Universal Accelerators in Digital Signal Processing; Bilkent'90 Int. Conference
on New Trends in Communication, Control and Signal Processing; Ankara,
Turkey, 1990; also in: Prepr. Int'l Workshop on Algorithms and Parallel
VLSI Architectures, Pont-à-Mousson, France, 1990.
The paper introduces the use of xputers to accelerate digital signal processing
algorithms and other parallel algorithms within a wide variety of application
areas. (Xputers are a novel class of high performance processors.) The
programming paradigm, which stems from the deterministically data-driven
xputer machine paradigm, is illustrated by introducing a few xputer application
examples in digital signal processing. The paper first briefly introduces
the novel high performance machine principles. Finally, the paper discusses
how the novel method may also be used for fast and cheap design of ASICs
and highly flexible accelerators, and gives some experimentally obtained
throughput and hardware cost figures. ----
[12] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
Xputers - An Open Family of Non von Neumann Architectures; Internal Report
No. 195/89, Department of Computer Science, University of Kaiserslautern,
1989.
The paper introduces the principles of xputers - in contrast to the
principles of von Neumann type computers. The paper characterizes a class
of algorithms which run orders of magnitude faster on xputers than on
computers and explains the novel execution mechanisms of xputers as well
as novel compilation techniques to generate high performance xputer machine
code. The paper proves that xputers are as universal as computers. Based
on a capacity analysis of communication mechanisms within the hardware,
the paper also shows the competitiveness of xputers against MIMD concurrent
computers, VLIW computers, and data flow machines, and illustrates that
the design space of xputer architectures opens up a promising new area
of research and development in processor architecture. ----
[13] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
"MoM - Map Oriented Machine"; Parallel Processing and Applications, North-Holland,
1988, Conference on Parallel Processing and Applications, L'Aquila, Italy,
1987.
In this paper we describe a new architecture, called 'Map Oriented Machine'
(MOM), which fills the gap between the totally flexible but slow von Neumann
computer and the very fast but expensive and inflexible fully parallelized
solution directly implemented on customized silicon. Some applications
of MOM, presented here, show that algorithms which have a map-oriented
organisation, such as image processing, can be implemented in a very efficient
way on MOM. The basic idea of speeding up the algorithms is to parallelize
the program access by combinational hardware, whose development is supported
by some CAD tools. ----
[14] R. Hartenstein, T. Mayer: DPLA - Dynamically
Programmable Logic Array; E.I.S Project of the EEC, Multi Project Chip
run 10, submitted to fabrication: June 87, fabricated: 1988.
The DPLA is an SRAM-based pattern matching circuit providing 24 patterns
of 12 bits each. The array may be expanded across chip boundaries by linking
several chips together. The circuit was fabricated at IMS in Duisburg with
a 3 micron NMOS process. The circuit area is 3.6 mm by 3.5 mm for 7000
transistors in a 40 pin DIL package. ------------
[15] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
MoM - Map Oriented Machine; in T. Ambler, P. Agrawal, W. Moore (eds.):
Hardware Accelerators for Electrical CAD, Adam Hilger, 1988, also: International
Workshop on Hardware Accelerators, Oxford, September 30 - October 2, 1987.
The paper introduces the MOM (Map-oriented Machine), a reconfigurable
procedurally data-driven machine architecture. A wide variety of problem-oriented
data paths (POLUs) for pattern matching applications can be generated with
a special CAD tool for the MOM - without any need to change the rest of
the machine hardware. The MoM has a two-dimensional memory organization
using a memory buffer, called "scan cache", which operates like a peep
hole to view the memory map. A data sequencer can be programmed to move
this scan cache window along a variety of generic scan paths through the
memory map. The paper also illustrates MOM use for acceleration in image
processing applications and integrated circuit layout rule checking.
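
The scan cache mechanism described in this abstract can be illustrated with a small software sketch. This is only an analogy under stated assumptions, not the MoM hardware or its programming interface: the names `linear_scan`, `run_scan`, and `window_sum`, the square window, and the row-by-row scan path are all hypothetical choices made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Memory = List[List[int]]


def linear_scan(rows: int, cols: int, win: int) -> Iterator[Tuple[int, int]]:
    """Yield top-left window positions row by row (one assumed scan path)."""
    for r in range(rows - win + 1):
        for c in range(cols - win + 1):
            yield r, c


def run_scan(
    memory: Memory,
    win: int,
    scan_path: Iterable[Tuple[int, int]],
    operator: Callable[[Memory], int],
) -> List[int]:
    """Move a win x win window along scan_path and apply operator at each stop."""
    results = []
    for r, c in scan_path:
        # The window plays the role of the "peep hole" onto the 2-D memory map.
        window = [row[c:c + win] for row in memory[r:r + win]]
        results.append(operator(window))
    return results


def window_sum(window: Memory) -> int:
    """Stand-in for a problem-oriented operator: sum of all window cells."""
    return sum(sum(row) for row in window)


memory = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
sums = run_scan(memory, 2, linear_scan(3, 3, 2), window_sum)
```

Swapping `linear_scan` for another generator changes the scan path without touching the operator, which loosely mirrors the separation between data sequencer and problem-oriented data path described above.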
----
[16] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
A Flexible Architecture for Image Processing; Microprocessing and Microprogramming,
North-Holland, 21, p. 65-72, 1987, also in: Proceedings of the EUROMICRO
Symposium, Portsmouth, UK, 1987.
The paper describes an innovative computation resource concept which
for a class of data processing problems is an alternative to the von Neumann
machine. The 'processor', called 'Map Oriented Machine' (MOM), used for
this concept is faster than a von Neumann-type computer, yet it is
substantially less expensive than a fully parallel hardwired implementation
using full custom or semi custom circuits. Instead of a program store with
a program sequencer, personalized hardware is used, and to 'program'
this machine, CAD tools are used instead of conventional compilers. The
MOM concept is a compromise between the purely sequential von Neumann concept
(sequential control part and sequential data part) and fully parallel solutions
(parallelized control part and parallelized data manipulation side), insofar
as the control part has been parallelized while the data manipulation side
still uses a universal sequential access organisation.
----
[17] R. Hartenstein, R. Hauck, A. Hirschbiel, W.
Nebel, M. Weber: PISA - a CAD package and special hardware for pixel-oriented
layout analysis; Proceedings ICCAD-84, Santa Clara, 1984; also: Internal
Report No. 111/84, Department of Computer Science, University of Kaiserslautern,
1984.
The paper describes a system for pixel-oriented layout analysis. It
may be used as a design rule checker, or to support other types of layout
analysis, such as circuit extraction, electrical rules checking, and others.
Two versions of the system are described: a special hardware version, and
a software version, which may also be used as a CAD tool to personalize
the hardware version. -------------------
[18] Reiner W. Hartenstein, Karin Schmidt: Combining
Structural and Procedural Programming by Parallelizing Compilation; Proceedings
of the Symposium on Applied Computing, Nashville, TN, Feb. 1995
A new architectural class of high performance data-parallel machines,
called Xputers, is presented which combines structural programming with
traditional von Neumann control flow (procedural) programming. From this
combination a new programming paradigm arises which is not familiar to
the usual software developer. To counteract this deficiency an automatic
parallelization and compilation method for Xputers has been developed for
the input language C. Sources are restructured and partitioned into an
Xputer-suitable execution sequence providing parallelism at expression
and at statement level. Data is mapped in a regular form onto the Xputer
memory space to be accessible by the Xputer's data sequencer hardware which
provides a generic set of fast address sequences. The data operations within
each part of the derived execution sequence are coded as a structural description
for further synthesis towards the reconfigurable ALU which is based on
field-programmable logic. Additionally, assembly code is produced in order
to control program execution through the data sequencer hardware. The entire
method performing the paradigm shift works without further user interaction
and all steps are driven by parameters describing the actual target hardware
configuration. ----
[19] R. W. Hartenstein, K. Schmidt: A Restructuring
Compilation Method for the Xputer Paradigm; IWPP 94, Proceedings of the
Int. Workshop on Parallel Processing, Bangalore, India, Dec. 1994
A new architectural class of high performance data-parallel machines,
called Xputers, is presented which combines structural programming with
traditional von Neumann control flow programming. From this combination
a new programming paradigm arises which is not familiar to the usual software
developer. To counteract this drawback a program partitioning, restructuring,
and mapping method for Xputers has been developed for the input language
C. Sources are restructured and partitioned into an Xputer-suitable execution
sequence providing parallelism at expression and at statement level. Data
is mapped in a regular form onto the Xputer memory space to be accessible
by the Xputer's data sequencer hardware which provides a generic set of
fast address sequences. The data operations within each part of the derived
execution sequence are coded as a structural description for further synthesis
towards the reconfigurable ALU which is based on field-programmable logic.
Additionally, assembly code is produced in order to control program execution
through the data sequencer hardware. The entire method performing the paradigm
shift works without further user interaction and all steps are driven by
parameters describing the actual target hardware configuration.
----
[20] R. W. Hartenstein, et al.: A New FPGA Architecture
for Word-oriented Datapaths; 4th Int. Workshop On Field Programmable Logic
And Applications, FPL'94, Prague, September 7-10, 1994, Lecture Notes
in Computer Science, Springer, 1994; also in: Canadian Workshop on Field-Programmable
Devices, FPD'94, Kingston, Ontario, June 13-16, 1994
An FPGA architecture (reconfigurable datapath architecture, rDPA) for
word-oriented datapaths is presented, which has been developed to support
a variety of Xputer architectures. In contrast to von Neumann machines
an Xputer architecture strongly supports the concept of the "soft ALU"
(rALU). Fine grained parallelism is achieved by using simple reconfigurable
processing elements which are called datapath units (DPUs). The word-oriented
datapath simplifies the mapping of applications onto the architecture.
Pipelining is supported by the architecture. It is extendable to almost
arbitrarily large arrays and is dynamically reconfigurable in-circuit.
The programming environment allows automatic mapping of the operators from
high level descriptions. The corresponding scheduling techniques for I/O
operations are explained. The rDPA can be used as a reconfigurable ALU
for bus-oriented host based systems as well as for rapid prototyping of
high speed datapaths. ----
[21] R. W. Hartenstein, et al.: A Dynamically Reconfigurable
Wavefront Array Architecture for Evaluation of Expressions; Proceedings
of the Int. Conference on Application-Specific Array Processors, ASAP'94,
San Francisco, IEEE Computer Society Press, Los Alamitos, CA, Aug. 1994.
A reconfigurable wavefront array rDPA (reconfigurable datapath architecture)
for evaluation of any arithmetic and logic expression is presented. Introducing
a global I/O bus to the array simplifies the use as a coprocessor in a
single bus oriented processor system. Fine grained parallelism is achieved
using simple reconfigurable processing elements which are called datapath
units (DPUs). The word-oriented datapath simplifies the mapping of applications
onto the architecture. Pipelining is supported by the architecture. It
is extendible to arbitrarily large arrays and dynamically in-circuit reconfigurable.
The programming environment allows automatic mapping of the operators from
high level descriptions. The corresponding scheduling techniques for I/O
operations are explained. The rDPA can be used as a reconfigurable ALU for
bus oriented host based systems as well as for rapid prototyping of high
speed datapaths. ----
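
The data-driven expression evaluation described in [21] can be illustrated with a short software sketch. It is a simplified analogy under stated assumptions, not the rDPA design: the `DPUS` table, the `evaluate` function, and the example expression `(a + b) - (c * d)` are hypothetical, and each "DPU" here is just a binary operator that fires once both of its operands are available.

```python
import operator
from typing import Callable, Dict, Tuple

# Hypothetical configuration: DPU name -> (binary operator, left input, right input).
DPUS: Dict[str, Tuple[Callable[[int, int], int], str, str]] = {
    "t1": (operator.add, "a", "b"),
    "t2": (operator.mul, "c", "d"),
    "out": (operator.sub, "t1", "t2"),
}


def evaluate(dpus, inputs: Dict[str, int]) -> Dict[str, int]:
    """Fire every DPU whose operands are present; repeat until all have fired."""
    values = dict(inputs)
    pending = dict(dpus)
    while pending:
        ready = [
            name
            for name, (_, left, right) in pending.items()
            if left in values and right in values
        ]
        if not ready:
            raise ValueError("some DPUs never receive their operands")
        # All ready DPUs fire in the same step, forming one wavefront.
        fired = {
            name: pending[name][0](values[pending[name][1]], values[pending[name][2]])
            for name in ready
        }
        values.update(fired)
        for name in ready:
            del pending[name]
    return values


result = evaluate(DPUS, {"a": 2, "b": 3, "c": 4, "d": 5})
```

No instruction sequence orders the operations here; the arrival of operands alone decides when each unit computes, which is the property the abstract attributes to the wavefront array.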
[22] A. Ast, R. W. Hartenstein, H. Reinig, K. Schmidt,
M. Weber: A General Purpose Xputer Architecture derived from DSP and Image
Processing; in M.A. Bayoumi (ed.): VLSI Design Methodologies for Digital
Signal Processing Architectures, Kluwer Academic Publishers, p. 365-394,
1994.
This paper illustrates a novel class of computational devices called
Xputers, which are up to several orders of magnitude more efficient
than computers based on the von Neumann paradigm. The paradigm is partially based
on using field-programmable logic. The paper shows how the new paradigm
is partly derived from accelerating features of image processors and digital
signal processors, and it illustrates Xputer execution mechanisms and associated
programming techniques by means of simple algorithm examples.
------------------
[23] A. Ast, R. Hartenstein, et al.: Novel High
Performance Machine Paradigms and Fast-Turnaround ASIC Design Methods:
a Consequence of, and, a Challenge to, Field-programmable Logic; Lecture
Notes on Computer Science: H. Grünbacher, R.W. Hartenstein (eds.):
"FPGAs, Architectures and Tools for Rapid Prototyping", Springer Verlag,
1993
New high performance computational paradigms have been introduced, such
as Xputers. Xputers have a reconfigurable ALU using FPGA-like technology.
This results in an efficient novel machine paradigm, competitive to many
ASIC solutions. It permits systematic derivation of machine code from high
level algorithm specs or programs. After testing and debugging, real gate
array specs may be derived by retargeting. This is a shortcut on the way
from algorithm to silicon: less effort and shorter time to market. Compared
to conventional ASIC design this means: a) real execution instead of simulation,
b) higher source language level and thus more concise specification.
---------
[24] R. W. Hartenstein: CASHE using a new machine
paradigm; Codes/CASHE Igls 1993, Collection of Foils
The presentation shows a new machine paradigm based on field-programmable
logic for computer aided SW/HW engineering (CASHE). For accelerating bottlenecks
in algorithms, a new procedural machine paradigm is needed, the Xputer
paradigm. This paradigm supports the use of a 'soft ALU' (reconfigurable
ALU). It has a data-procedural execution mechanism and it is deterministic
in contrast to dataflow machines. High performance improvements have been
achieved for a class of regular, scientific computations. The Xputer serves
as a universal accelerator co-processor platform or as a stand-alone platform
for embedded systems. It offers new approaches to rapid ASIC implementation and
to supercomputing. ------------
[25] A. Ast, J. Becker, R. Hartenstein, H. Reinig,
K. Schmidt, M. Weber: XPUTER: ASIC or Standard Circuit?; Invited Paper:
GME Fachtagung "Mikroelektronik" in Dresden 08. 10. 93, 1993.
This paper illustrates an innovative compilation technique which is
important for a novel class of computational devices called Xputers, which
are up to several orders of magnitude more efficient than computers based on the von Neumann
paradigm. Xputers are as flexible and as universal as computers.
The flexibility of Xputers is achieved by using field-programmable logic
(interconnect-reprogrammable media) as the essential technology platform
(whereas the universality of computers stems from using the RAM). The paper
first briefly illustrates the Xputer paradigm as a prerequisite needed
to understand the fundamental issues of this new compilation technology.
----
[26] A. Ast, R. Hartenstein, et al.: Novel High
Performance Machine Paradigms and Fast-Turnaround ASIC Design Methods:
a Consequence of, and a Challenge to, Field-programmable Logic; Proceedings
of the 2nd Int. Workshop on Field-Programmable Logic and Applications,
31. 08. - 02. 09. 92, Vienna Austria, 1992.
New high performance computational paradigms have been introduced, such
as Xputers. Xputers have a reconfigurable ALU using FPGA-like technology.
This results in an efficient novel machine paradigm, competitive to many
ASIC solutions. It permits systematic derivation of machine code from high
level algorithm specs or programs. After testing and debugging, real gate
array specs may be derived by retargeting. This is a shortcut on the way
from algorithm to silicon: less effort and shorter time to market. Compared
to conventional ASIC design this means: a) real execution instead of simulation,
b) higher source language level and thus more concise specification. ----
[27] A. Ast, R. Hartenstein, H. Reinig, K. Schmidt,
M. Weber: A Novel High-performance Machine Paradigm and ASIC Design Methodology;
IEEE Int. Design Automation Workshop ("IEEE Russian Workshop"), 29. - 30.
06. 92, Moscow, 1992.
This paper illustrates an innovative compilation technique which is
important for a novel class of computational devices called Xputers, which
are up to several orders of magnitude more efficient than computers based on the von Neumann
paradigm. Xputers are as flexible and as universal as computers.
The flexibility of Xputers is achieved by using field-programmable logic
(interconnect-reprogrammable media) as the essential technology platform
(whereas the universality of computers stems from using the RAM). The paper
first briefly illustrates the Xputer paradigm as a prerequisite needed
to understand the fundamental issues of this new compilation technology.
----
[28] R. Hartenstein, A. Hirschbiel, K. Schmidt,
M. Weber: A Novel Paradigm of Parallel Computation and its Use to Implement
Simple High-Performance-HW, Future Generation Computer Systems 7 91/92,
p. 181-198, North Holland
This paper introduces a novel (non-von Neumann) paradigm of parallel
computation supporting a much more efficient implementation of parallel
algorithms. Acceleration factors of up to more than 2000 have been obtained
experimentally on the MoM architecture for a number of important applications
- although using hardware simpler than that of a single RISC
microprocessor. The machine organization and the most important hardware
features of xputers are briefly introduced. The programming paradigm and
its flexibility are illustrated by simple DSP and image processing examples.
----
[29] R. W. Hartenstein, K. Schmidt, H. Reinig, M.
Weber: A Novel Compilation Technique for a Machine Paradigm Based on Field-Programmable
Logic; in Will Moore, Wayne Luk (eds.): FPGAs; Oxford 1991 International
Workshop on Field Programmable Logic and Applications, Abingdon EE&CS
Books, Abingdon, 1991.
This paper introduces an innovative compilation technique which is essential
to a novel class of computational devices called Xputers, which are up to
several orders of magnitude more efficient than computers based on the von Neumann
paradigm. Xputers are as flexible and as universal as computers. But the
central technology platform of flexibility is field-programmable logic
(we would prefer the term interconnect-reprogrammable media), rather than
the RAM which gives the flexibility of computers. The paper first briefly
summarizes the Xputer paradigm as a prerequisite needed to understand the
fundamental issues of this new compilation technology. ----
[30] R. W. Hartenstein: Xputer: ein neues Maschinen-Paradigma
für Höchstleistungsrechner [Xputer: A New Machine Paradigm for
High-Performance Computers]; Lessacher Informatik-Kolloquien,
Lessach, Austria, September 18-21, 1990. ----
[31] R. W. Hartenstein, H. Reinig, M. Riedmüller,
K. Schmidt: A Novel Computational Paradigm: Much More Efficient Than Von
Neumann Principles; 13th IMACS World Congress, Dublin, Ireland, 1991.
Computers (based on von Neumann principles) are extremely inefficient.
That's why this paper introduces a novel computational paradigm based
on new hardware machine principles. Such machines, called "xputers", avoid
most of the bottlenecks known from (von Neumann) computers, so that a hardware
efficiency is obtained which is higher by several orders of magnitude.
By means of a few algorithm examples the new paradigm will be introduced
as a new programming paradigm, which is data-procedural (which is more
direct than the control-procedural von Neumann paradigm). Finally the paper
gives a survey on the novel application development environments needed
for xputers and their advantages over such tools for computers. Such application
support for xputers includes two alternative source levels: high level
programs, or very high level algorithm specifications. ---- paper
037
[32] R. W. Hartenstein, A. G. Hirschbiel, M. Riedmüller,
K. Schmidt, M. Weber: A High Performance Machine Paradigm Based on Auto-Sequencing
Data Memory; HICSS-24, Hawaii Int. Conference on System Sciences, Koloa,
Hawaii, 1991. - 2nd best paper award "honourable mention" -
This paper introduces a novel (non-von Neumann) programming paradigm
of parallel computation featuring a much more efficient implementation
of parallel algorithms, as well as a novel (hardware) machine paradigm
efficiently supporting such implementations. Acceleration factors of up
to more than 2000 have been obtained experimentally on an example architecture
for a number of important applications, although the hardware used is
simpler than that of a single RISC microprocessor. Due to its auto-sequencing
data memory the machine principles are partly related to the organization
of associative memories or systems. The machine organization and its most
important hardware features are briefly introduced. The programming paradigm
and its flexibility based on field-programmable logic are illustrated by
a few application examples. ---- paper
033
[33] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
Xputers: Very High Throughput by Innovative Computing Principles; 5th Jerusalem
Conference on Information Technology (JCIT), Jerusalem, Israel, October
1990, published by IEEE Computer Society, Los Alamitos, CA, USA,
p. 43-50.
The paper first introduces the novel machine organization of xputers
in contrast to von Neumann type computer principles. Then the paper introduces
the novel xputer paradigm as a model to implement parallel algorithms (important
e.g. in image processing, digital signal processing, computer graphics,
VLSI layout verification), which run orders of magnitude faster on xputers
than on computers. The paper illustrates this model and the novel execution
mechanisms of xputers by a few simple application examples. Xputer principles
are sufficiently simple to open up a large new R&D area to define a
wide variety of innovative architectures. The paper gives some throughput
figures and hardware cost figures obtained experimentally from
application examples running on an xputer architecture and from code
generated by a compiler, both of which have been implemented. Finally it
discusses technology issues and the use of the xputer paradigm as a novel
method for very fast and cheap design of ASICs. ---- paper
033
[34] R. W. Hartenstein, A. G. Hirschbiel, K. Lemmert,
M. Riedmüller, K. Schmidt, M. Weber: Xputer Use in Image Processing
and Digital Signal Processing; SPIE Visual Communication and Image Processing '90,
Lausanne, Switzerland, published by International Society for Optical Engineering,
Bellingham, WA, USA, 1990, p. 778-789.
This paper introduces a novel (non-von Neumann) paradigm of parallel
computation supporting a much more efficient implementation of parallel
algorithms. Acceleration factors of up to more than 2000 have been obtained
experimentally on the MoM architecture for a number of important applications,
although the processor hardware used is simpler than that of a
single RISC microprocessor. The most important hardware features of Xputers
are briefly introduced. The programming paradigm and its flexibility are
illustrated by simple DSP and image processing algorithm examples.
----
paper
032
[35] R. W. Hartenstein, A. G. Hirschbiel, K. Schmidt,
M. Weber: A Novel ASIC Design Approach based on a New Machine Paradigm;
European Solid-State Circuits Conference 1990, Grenoble, France.
This paper first introduces a novel machine paradigm as a model for
very high performance implementation of parallel algorithms in important
application areas such as image processing, digital signal processing,
computer graphics, VLSI layout verification, routing and others. The paper
illustrates this model by means of a simple application example. Then the
paper introduces a novel method for fast and cheap design of ASICs and
highly flexible accelerators, which is based on this paradigm. The paper
gives some hardware throughput and hardware cost figures obtained
experimentally. ---- paper
030
[36] R. W. Hartenstein, A. G. Hirschbiel, M. Riedmüller,
K. Schmidt, M. Weber: Automatic Synthesis of Cheap Hardware Accelerators
for Signal Processing and Image Preprocessing; 12. DAGM-Symposium Mustererkennung,
Oberkochen-Aalen, 1990. - best paper award -
The paper introduces a novel (non-von Neumann) paradigm of parallel
computation supporting a much more efficient implementation of parallel
algorithms. Acceleration factors of up to more than 2000 have been obtained
experimentally on the MoM architecture for a number of important applications,
although the processor hardware used is simpler than that of a
single RISC microprocessor. The most important hardware features of Xputers
are briefly introduced. The paradigm and its flexibility are illustrated by
simple DSP and image preprocessing algorithm examples. ----
paper
029
[37] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
A Novel Paradigm of Parallel Computation and its Use to Implement Simple
High Performance Hardware; CONPAR '90 - VAPP IV, Zürich, 1990.
This paper introduces a novel (non-von Neumann) paradigm of parallel
computation supporting a much more efficient implementation of parallel
algorithms. Acceleration factors of up to more than 2000 have been obtained
experimentally on the MoM architecture for a number of important applications,
although the processor hardware used is simpler than that of a
single RISC microprocessor. The most important hardware features of Xputers
are briefly introduced. The programming paradigm and its flexibility are
illustrated by simple DSP and image processing algorithm examples.
----
paper
028
[38] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
The Machine Paradigm of Xputers and its Application to Digital Signal Processing
Acceleration; 1990 Int. Conference on Parallel Processing, St. Charles,
Illinois, 1990.
The paper gives an introduction to using xputers (a novel class of high
performance processors - based on one of the most important machine concepts
since von Neumann) for acceleration of digital signal processing. Its novel
programming paradigm of data sequencing is illustrated by an FFT digital
signal processing example. ---- paper
027
[39] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
Xputers - An Open Family of Non von Neumann Architectures; Proc. of 11th
ITG/GI-Conference: Architektur von Rechensystemen, VDE-Verlag, 1990.
The paper introduces the principles of xputers in contrast to the
principles of von Neumann type computers. The paper characterizes a class
of algorithms which run orders of magnitude faster on xputers than on
computers and explains the novel execution mechanisms of xputers as well
as novel compilation techniques to generate high performance xputer machine
code. The paper proves that xputers are as universal as computers. Based
on a capacity analysis of communication mechanisms within the hardware
the paper also shows the competitiveness of xputers against MIMD concurrent
computers, VLIW computers, and data flow machines, and illustrates that
the design space of xputer architectures opens up a promising new area
of research and development in processor architecture. ---- paper
023
[40] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
Mapping Systolic Arrays onto the Map-Oriented Machine (MoM); in: McCanny,
McWhirter, Swartzlander: Systolic Array Processors, Prentice Hall, London,
1989.
A method, SYS2 to MoM, for mapping systolic systems onto the MoM (map-oriented
machine) is introduced in this paper. This mapping method is needed to
derive a methodology of MoM application development support from the theory
of systolic array synthesis. The MoM is a flexible non-von-Neumann computer
principle developed at Kaiserslautern. MoM "programming" uses
combinational code (for path programming) instead of sequential code (for
sequencing). That's why for a wide variety of computation problems the
MoM provides substantial acceleration factors over von Neumann machines.
The MoM can also be used as an inexpensive and highly flexible programmable
pseudo-systolic processor for emulation of systolic arrays, for example
in experimenting with alternative systolic architectures at very early
phases of the systolic array design process. ---- paper
019
[41] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
A Pseudo Parallel Architecture for Systolic Algorithms; Proc. of the IFIP
Workshop on Parallel Architectures on Silicon, Grenoble, 1989.
This paper introduces a family of non-von-Neumann innovative computing
devices, called Xputers. The map-oriented machine (MoM), an example Xputer
architecture, is a flexible non-von-Neumann accelerator machine having
been developed at Kaiserslautern University. This machine uses a two-dimensional
map-oriented data memory. Over this memory a variable-sized window cache
can be moved in arbitrary move schemes to analyse and change the data via
the problem-oriented logic unit, which delivers powerful, programmable
pattern matching mechanisms. The MoM can be used to speed up signal processing,
image processing and VLSI layout processing and many other applications,
but it may also serve as a systolic array simulator and evaluator. Moreover
it can also be used as a low-cost, simple, flexible, and programmable array
emulation computer. In contrast to a systolic array, where data streams
are moving through an array of PEs, the MoM keeps data at fixed locations
in its memory and moves its scan cache window in an application-specific
manner across this memory space. ---- paper
018
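The scan-window mechanism summarized in [41] can be pictured with a small software model. The following Python fragment is only a minimal sketch under stated assumptions: the names `raster_scan`, `apply_window` and `window_sum`, the row-by-row move scheme, and the 2x2 window size are all hypothetical choices made for illustration and are not taken from the MoM hardware or its tools. It shows data staying at fixed positions in a two-dimensional memory while a window is moved across it and an operation is applied at each position.

```python
from typing import Callable, Iterator, List, Tuple

Grid = List[List[int]]


def raster_scan(rows: int, cols: int, h: int, w: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left corner of an h x w window, row by row.

    This is just one possible move scheme (an assumption for this sketch).
    """
    for r in range(rows - h + 1):
        for c in range(cols - w + 1):
            yield r, c


def apply_window(memory: Grid, h: int, w: int, op: Callable[[Grid], int]) -> Grid:
    """Move an h x w window over the 2D memory and apply op at each position."""
    rows, cols = len(memory), len(memory[0])
    out = [[0] * (cols - w + 1) for _ in range(rows - h + 1)]
    for r, c in raster_scan(rows, cols, h, w):
        # The data stays in place; only the window position changes.
        window = [row[c:c + w] for row in memory[r:r + h]]
        out[r][c] = op(window)
    return out


def window_sum(window: Grid) -> int:
    """Example operation: sum of all cells currently covered by the window."""
    return sum(sum(row) for row in window)


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
result = apply_window(image, 2, 2, window_sum)
print(result)  # [[12, 16], [24, 28]]
```

Replacing `raster_scan` with another generator of window positions changes the move scheme without touching the operation, which is the separation the abstract describes in hardware terms.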
[42] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
A Pseudo Parallel Architecture for Systolic Algorithms; Proc. of the Int.
Conference on VLSI and CAD, Seoul (Korea), 1989.
The map-oriented machine (MoM) is a flexible non-von-Neumann accelerator
having been developed at Kaiserslautern University. This machine uses a
2-dimensional map-oriented data memory, over which a variable-sized window
cache can be moved in arbitrary schemes to analyse and change data via
the problem-oriented logic unit, which delivers powerful, programmable
pattern matching mechanisms. The MoM can be used to speed up signal processing,
image processing, VLSI layout processing and many other applications, and
it may serve as a systolic array simulator and evaluator. Moreover it can
be used as a low-cost, simple, flexible and programmable array emulation
computer. In contrast to systolic arrays, where data streams are moving
through an array of PEs, the MoM keeps data at fixed locations in its memory
and moves its window cache in an application-specific manner across this
memory space. ---- paper
017
[43] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
The Map-oriented Machine (MoM), a Custom-designed Architecture Compared
to Standard Designs; CompEuro 89, IEEE Press, Publ. by IEEE, IEEE Service
Center, Piscataway, NJ, USA, 1989, p 5/7-9, 1989.
The von Neumann principle has two bottlenecks: program accessing and data
accessing. An innovative non-von-Neumann principle, introduced
at Kaiserslautern, eliminates one of them. Its new processor, the Map-oriented
Machine (MoM), is compared with the von Neumann concept. The MoM is the
key resource to a completely new philosophy of data processing which we
call "map-oriented processing". It does not use sequential programs, since
it has no program sequencer. The way it is programmed we call "combinational
programming". For surprisingly many applications it provides acceleration
factors of up to several orders of magnitude, compared to von-Neumann-type
processing. Existing computer application support tools (assemblers, compilers,
operating systems, etc.) cannot be used for the MoM since they produce
sequential code. That's why a new programming theory and new application
support tools are introduced, such that the MoM is based on a marriage
between standard IC use and ASIC techniques. ---- paper
010
[44] R. W. Hartenstein, A. G. Hirschbiel, M. Weber:
MoM - Map Oriented Machine, An Innovative Computing Architecture; Internal
Report No. 181/88, University of Kaiserslautern, 1988.
In this paper we describe an innovative computing architecture, called
Map Oriented Machine (MOM). Concerning speed, cost and flexibility today
there mainly exist two extreme solutions: the totally flexible, but slow
von Neumann computer and the very fast, but expensive and inflexible fully
parallelized solution directly implemented on customized silicon. With
some applications running on the MOM we show that it not only fills this
gap, but also is a very good instrument to implement algorithms which have
a map-oriented organization such as image processing. The basic idea of
speeding up the algorithms is to parallelize the program access by combinational
hardware, whose development is supported by some CAD tools. ---- paper
012
[45] R. W. Hartenstein, et al.: A Datapath Synthesis
System for the Reconfigurable Datapath Architecture; Asia and South Pacific
Design Automation Conference, ASP-DAC'95, Nippon Convention Center, Makuhari,
Chiba, Japan, Aug. 29 - Sept. 1, 1995
A datapath synthesis system (DPSS) for the reconfigurable datapath architecture
(rDPA) is presented. The DPSS allows automatic mapping of high level descriptions
onto the rDPA without manual interaction. The required algorithms of this
synthesis system are described in detail. Optimization techniques like
loop folding or loop unrolling are sketched. The rDPA is scalable to arbitrarily
large arrays and reconfigurable to be adaptable to the computational problem.
Fine grained parallelism is achieved by using simple reconfigurable processing
elements which are called datapath units (DPUs). The rDPA can be used as
a reconfigurable ALU for bus oriented systems as well as for rapid prototyping
of high speed datapaths. ---- paper
066 - 58
[46] A. Ast, J. Becker, R. Hartenstein, et al.:
MoPL-3: A New High Level Xputer Programming Language; 3rd Int. Workshop
On Field Programmable Logic And Applications, Oxford, 7.-10. September
1993
This paper illustrates a new high level programming language which is
important for a novel class of computational devices called Xputers,
which are by up to several orders of magnitude more efficient than the
von Neumann paradigm of computers. Xputers are as flexible and as universal
as computers. The flexibility of Xputers is achieved by using field-programmable
logic (interconnect-reprogrammable media) as the essential technology platform.
The paper first briefly illustrates the Xputer paradigm as a prerequisite
needed to understand the fundamental issues of this new language.
---- paper
050
[47] Reiner W. Hartenstein, Juergen Becker, et al.:
A Two-Level Hardware/Software Co-Design Framework for Automatic Accelerator
Generation; Workshop on Design Methodologies for Microelectronics, Smolenice
Castle, Slovakia, September 1995
This paper presents a novel hardware/software co-design framework, CoDe-X,
for automatic generation of Xputer based accelerators. CoDe-X accepts C programs
and carries out both the host/accelerator partitioning for performance
optimization (first level) and the parameter-driven sequential/structural
partitioning of the accelerator source code to optimize the utilization
of its reconfigurable datapath resources (second level). --- paper
069
[48] R. Hartenstein, A. Hirschbiel, M. Weber: MOM
- Map Oriented Machine; in: E. Chiricozzi and A. D'Amico: Parallel Processing
and Applications, North-Holland, 1988 --- paper
014
[49] Andre DeHon: Reconfigurable Architectures for
General Purpose Computing; MIT, Artificial Intelligence Lab, Tech. Report
1586, September 1996
General-purpose computing devices allow us to (1) customize computation
after fabrication and (2) conserve area by reusing expensive active circuitry
for different functions in time. We define RP-space, a restricted domain
of the general-purpose architectures and account for most of the area overhead
associated with RP devices: (1) instructions which tell the device how
to behave, and (2) flexible interconnect which supports task dependent
dataflow between operations. We can characterize RP-space by the allocation
and structure of these resources and compare the efficiencies of architectural
points across broad application characteristics. Conventional FPGAs fall
at one extreme end of this space and their efficiency ranges over two orders
of magnitude across the space of application characteristics. Understanding
RP-space and its consequences allows us to pick the best architecture for
a task and to search for more robust design points in the space. --------
[50] see [55]
[51] Reiner W. Hartenstein, Juergen Becker, Michael
Herz, Ulrich Nageldinger: A Novel Universal Sequencer Hardware; Proceedings
of Fachtagung Architekturen von Rechensystemen ARCS'97, Rostock, Germany,
September 8-11, 1997
This paper introduces a powerful novel sequencer hardware for controlling
computational machines and for structured DMA (direct memory access) applications.
The paper introduces the principles and the design of a novel class of
this sequencer hardware which supports two-dimensional memory address space
or at least the two-dimensional visualization of the traditional one-dimensional
address space. From these concepts it derives a classification scheme of
computational sequencing patterns and storage schemes. -----
[52] Reiner W. Hartenstein, Thomas Hoffmann, Michael
Herz, Ulrich Nageldinger: Exploiting Contemporary Memory Techniques in
Reconfigurable Accelerators; Proc. FPL-98, Int'l Workshop on Field-Programmable
Logic and Applications, Tallinn, Estonia, Aug./Sept. 1998
This paper discusses the memory interface of custom computing machines.
We present a high speed parallel memory for the MoM-PDA machine, which
is based on the Xputer paradigm. The memory employs DRAMs instead of the more
expensive SRAMs. To enhance the memory bandwidth, we use a threefold approach:
modern memory devices featuring burst mode, an efficient memory architecture
with multiple parallel modules, and memory access optimization for single
applications. To exploit the features of the memory architecture, we introduce
a strategy to determine optimized storage schemes for a class of applications.
------
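To make the link between parallel memory modules and storage schemes in [52] more concrete, here is a minimal Python sketch. It is not the strategy from the paper: the low-order interleaving rule in `bank_of`, the one-access-per-module-per-cycle cost model, and the row padding trick are generic textbook assumptions, and all names are hypothetical. It only illustrates why the way data is laid out can decide whether a group of accesses is served in parallel or serially.

```python
from collections import Counter
from typing import Iterable, List


def bank_of(address: int, n_banks: int) -> int:
    """Assumed low-order interleaving: consecutive addresses hit consecutive modules."""
    return address % n_banks


def cycles_needed(addresses: Iterable[int], n_banks: int) -> int:
    """Cycles to serve a non-empty group of accesses if each module serves one per cycle."""
    hits = Counter(bank_of(a, n_banks) for a in addresses)
    return max(hits.values())


def column_addresses(col: int, n_rows: int, row_pitch: int) -> List[int]:
    """Linear addresses of one column of a row-major array with the given row pitch."""
    return [r * row_pitch + col for r in range(n_rows)]


N_BANKS = 4

# Plain storage scheme: 4 words per row, so a column access has stride 4.
plain = column_addresses(col=0, n_rows=4, row_pitch=4)    # [0, 4, 8, 12]
# Padded storage scheme: one unused word per row, so the stride becomes 5.
padded = column_addresses(col=0, n_rows=4, row_pitch=5)   # [0, 5, 10, 15]

plain_cycles = cycles_needed(plain, N_BANKS)
padded_cycles = cycles_needed(padded, N_BANKS)
print(plain_cycles, padded_cycles)  # 4 1
```

In this toy model the plain layout sends every column element to the same module, while the padded layout spreads the same column over all four modules.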
[53] Reiner W. Hartenstein, Michael Herz, Thomas
Hoffmann, Ulrich Nageldinger: On Reconfigurable Co-Processing Units; Proceedings
of Reconfigurable Architectures Workshop (RAW98), held in conjunction with
12th International Parallel Processing Symposium (IPPS-98) and 9th Symposium
on Parallel and Distributed Processing (SPDP-98), Orlando, Florida, USA,
March 30, 1998
Proceedings: Jose Rolim (Ed.): Parallel and Distributed Processing,
Lecture Notes in Computer Science 1388, Springer-Verlag, Germany, 1998
In recent years reconfigurable computing has grown from a niche application
to an important R&D scene. Even today, however, most architectures lack essential
features for convenient use as a co-processing unit. For example, embedded
accelerator design with traditional FPGAs is very similar to sophisticated
ASIC design due to the bit-level granularity of FPGAs. In this paper important
topics for reconfigurable platforms in multitasking systems are discussed.
Run-time programmability as well as rapid application implementation using
high-level languages are illustrated. Besides the underlying concepts the
hardware implementation of a field-programmable ALU array (FPAA), the KrAA-III,
is explained. --- paper
099
[54] Jürgen Becker, Reiner W. Hartenstein, Michael
Herz, Ulrich Nageldinger: Parallelization in Co-Compilation for Configurable
Accelerators; in proceedings of Asia and South Pacific Design Automation
Conference, ASP-DAC’98, Yokohama, Japan, Feb. 10-13, 1998
The paper introduces a novel co-compiler and its “vertical” parallelization
method, including a general model for co-operating host/accelerator platforms
and a new parallelizing compilation technique derived from it. Small examples
are used for illustration. It explains the exploitation of different levels
of parallelism to achieve optimized speed-ups and hardware resource utilization.
Section II introduces novel vertical parallelization techniques for
configurable accelerators, which exploit parallelism at four different
levels (task, loop, statement, and operation level).
Finally the results are illustrated by a simple application example. But
first the paper summarizes the fundamentally new dynamically reconfigurable
hardware platform underlying the co-compilation method. -----
[55] Reiner W. Hartenstein: The Microprocessor is no more
General Purpose: why Future Reconfigurable Platforms will win; invited
paper, Proceedings of the International Conference on Innovative Systems
in Silicon, ISIS'97, Austin, Texas, USA, October 8-10, 1997
The paper is a plea for a radical methodological change in R&D
of dynamically reconfigurable circuits. The paper illustrates that the
current mainstream approach based on placement and routing is not very
likely to obtain the area-efficiency and throughput needed to cope with
the emerging cost crisis of future silicon technology generations. The
proposed changes include both architectural principles and fundamental
issues in application development support environments. The paper illustrates
the feasibility of general purpose programmable accelerators and their
commercialization. The paper highlights computer systems’ increasing dependency on
add-on accelerators. It shows why only a new methodology will let reconfigurable
hardware overcome its role as a niche technology and become competitive
with ASICs and other hardwired accelerators. It illustrates the possible
coming crisis of ASIC design based on wasting chip area by placement and
routing and discusses the vision of software-only implementation of accelerators.
---- paper
097
[56] Reiner W. Hartenstein, Juergen Becker, Michael Herz,
Ulrich Nageldinger: A Novel Universal Sequencer Hardware; Proceedings of
Fachtagung Architekturen von Rechensystemen ARCS'97, Rostock, Germany,
September 8-11, 1997
This paper introduces a powerful novel sequencer hardware for controlling
computational machines and for structured DMA (direct memory access) applications.
The paper introduces the principles and the design of a novel class of
this sequencer hardware which supports two-dimensional memory address space
or at least the two-dimensional visualization of the traditional one-dimensional
address space. From these concepts it derives a classification scheme of
computational sequencing patterns and storage schemes. -----
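The idea of viewing the traditional one-dimensional address space as two-dimensional, as described in [56], can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a model of the sequencer hardware: the row-major mapping in `to_linear`, the function name `row_scan`, and its step parameters are hypothetical. It shows one simple sequencing pattern expressed in 2D coordinates and emitted as a stream of linear addresses.

```python
from typing import Iterator


def to_linear(x: int, y: int, width: int) -> int:
    """Map a 2D coordinate to a 1D address, assuming row-major storage."""
    return y * width + x


def row_scan(width: int, height: int, step_x: int = 1, step_y: int = 1) -> Iterator[int]:
    """Generate linear addresses for a row-by-row scan with the given step sizes."""
    for y in range(0, height, step_y):
        for x in range(0, width, step_x):
            yield to_linear(x, y, width)


# Visit every second column of a 4 x 3 memory area.
addresses = list(row_scan(width=4, height=3, step_x=2))
print(addresses)  # [0, 2, 4, 6, 8, 10]
```

Because the pattern is fully described by a few parameters, the address stream can be produced without executing an instruction per access, which is the kind of regularity a hardware address generator can exploit.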
[57] R. Hartenstein, J. Becker, M. Herz, U. Nageldinger:
A Novel Sequencer Hardware for Application Specific Computing; Proceedings
of 11th International Conference on Application-Specific Systems, Architectures
and Processors, ASAP'97, Zurich, Switzerland, July 14-16, 1997
This paper introduces a powerful novel sequencer for controlling computational
machines and for structured DMA (direct memory access) applications. It
is mainly focused on applications using 2-dimensional memory organization,
from which most of the inherent speed-up is obtained. A classification scheme
of computational sequencing patterns and storage schemes is derived. In
the context of application specific computing the paper illustrates its
usefulness especially for data sequencing, recalling examples
published earlier as far as needed for completeness. The paper also discusses
how the new sequencer hardware provides substantial speed-up compared to
the use of traditional sequencing hardware. ----
[58] R. W. Hartenstein, et al.: A Datapath Synthesis System
for the Reconfigurable Datapath Architecture; Asia and South Pacific Design
Automation Conference, ASP-DAC'95, Nippon Convention Center, Makuhari,
Chiba, Japan, Aug. 29 - Sept. 1, 1995
A datapath synthesis system (DPSS) for the reconfigurable datapath architecture
(rDPA) is presented. The DPSS allows automatic mapping of high level descriptions
onto the rDPA without manual interaction. The required algorithms of this
synthesis system are described in detail. Optimization techniques like
loop folding or loop unrolling are sketched. The rDPA is scalable to arbitrarily
large arrays and reconfigurable to be adaptable to the computational problem.
Fine grained parallelism is achieved by using simple reconfigurable processing
elements which are called datapath units (DPUs). The rDPA can be used as
a reconfigurable ALU for bus oriented systems as well as for rapid prototyping
of high speed datapaths. ---- paper
066
[59] K. Lemmert: SYS3 Systolic Synthesis System around
KARL, Ph. D. dissertation, Universitaet Kaiserslautern, 1989
This dissertation describes the theory and implementation of a systolic
array synthesis program, SYS3 (Systolic Synthesis System), which accepts
a KARL description of an equation system to be implemented and generates
KARL descriptions of a set of alternative systolic array architectures.
It has in common with compilers for xputers that its front-end operation
is based on data dependency analysis, i.e. the source program is not
interpreted as a sequence spec.
post scriptum (DRAFT)
R. Hartenstein (invited embedded
tutorial): Coarse Grain Reconfigurable Architectures; 6th Asia and South
Pacific Design Automation Conference 2001 (ASP-DAC 2001), January 30 -
February 2, 2001, Pacifico Yokohama, Yokohama, Japan, ---- paper
110
R. Hartenstein, Th. Hoffmann,
U. Nageldinger: Design-Space Exploration of Low Power Coarse Grained Reconfigurable
Datapath Array Architectures; PATMOS 2000 International Workshop - Power
and Timing Modeling, Optimization and Simulation, Göttingen, Germany
- September 13-15, 2000 ---- paper
109
R. Hartenstein, M. Herz, Th. Hoffmann,
U. Nageldinger: Generation of Design Suggestions for Coarse-Grain Reconfigurable
Architectures; 10th International Workshop on Field Programmable Logic
and Applications, FPL '2000, Villach, Austria, Aug.27-30, 2000. ---- paper
108
R. Hartenstein, M. Herz, Th. Hoffmann,
U. Nageldinger: KressArray Xplorer: A New CAD Environment to Optimize Reconfigurable
Datapath Array Architectures; 5th Asia and South Pacific Design Automation
Conference 2000, ASP-DAC 2000, Pacifico Yokohama, Yokohama, Japan, January
25-28, 2000 ---- paper
107
Reiner W. Hartenstein, M. Herz,
T. Hoffmann, U. Nageldinger: Mapping Applications onto reconfigurable KressArrays;
9th International Workshop on Field Programmable Logic and Applications,
FPL '99, Glasgow, UK, Aug.30-Sept.2, 1999 ---- paper
106
Reiner W. Hartenstein, Juergen
Becker, Michael Herz, Ulrich Nageldinger: Data Scheduling in Hardware/Software
Co-Design for Field-programmable Accelerators; Proceedings of 7th International
Workshop on Field Programmable Logic, FPL'97, London, UK, September 1-3,
1997 ---- paper
094