Re: Mainframe not a good architecture for interactive was Re: What is the future of COBOL? Answer: Irrelevant???

From: Peter E.C. Dashwood (dashwood_at_enternet.co.nz)
Date: 12/19/03


Date: Fri, 19 Dec 2003 22:10:32 +1300


"Steve Thompson" <steve_nospam_t@ix.netcom.com> wrote in message
news:MPG.1a4c36ca475b07759896a4@News.CIS.DFN.DE...
> In article <3fda9e76_6@news.athenanews.com>,
> dashwood@enternet.co.nz says...
> <snip>
> > > There is also the VERY real issue that PC's cannot handle
> > > high speed I/O to DASD or networks, nothing at all compared
> > > to even a modest mainframe.
> > >
> >
> > See above.
> <snip>
>
> Any time you are ready to eat crow, humble pie, etc., let me
> know, I'll do my best to help you out. ;-)
>

Well, I'm not hungry at the moment, thank you...

> Just to give you a few ideas --
>
> PC is BUS centric. When bus is busy, nothing else can transfer
> data.

Really? So what do the letters DMA mean to you (Disastrously Muddled
Architecture perhaps...?)

> So only one device adapter can be functioning at a time
> (even with bus mastering). 30-60MB/sec bursts are what can be
> done?
>
> Mainframes are MEMORY centric using ECC type memory. When the
> memory is being accessed, it is only being accessed by one of n ports (n
> port memory) which means that n-1 other processors can be
> accessing it!
>

Did you know that ECC works on binary polynomials? It is a slow but
effective way of ensuring correct results. (Faster since they put hardware
in to support it...<G>)
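
For anyone wondering what "binary polynomials" looks like in practice, here is a toy sketch only. It treats a 7-bit word as a polynomial over GF(2) and uses the generator x^3 + x + 1 (the cyclic Hamming(7,4) code) to correct any single flipped bit. Real ECC memory uses wider codes implemented in hardware; the function names and the 4-bit data width are my own choices for illustration, not anything taken from a memory controller.

```python
# Toy single-error-correcting code built on GF(2) polynomial division.
# Assumption: cyclic Hamming(7,4) with generator g(x) = x^3 + x + 1.
GEN = 0b1011


def poly_mod(value, gen=GEN):
    """Remainder of value(x) divided by gen(x), coefficients in GF(2)."""
    gen_degree = gen.bit_length() - 1
    while value.bit_length() - 1 >= gen_degree:
        shift = value.bit_length() - 1 - gen_degree
        value ^= gen << shift  # subtraction in GF(2) is XOR
    return value


def encode(data4):
    """Append 3 check bits so the 7-bit codeword is divisible by g(x)."""
    shifted = data4 << 3
    return shifted | poly_mod(shifted)


# A single-bit error at position i leaves the remainder x^i mod g(x);
# for this generator the seven remainders are all different.
SYNDROME_TO_BIT = {poly_mod(1 << i): i for i in range(7)}


def decode(word7):
    """Correct at most one flipped bit, then return the 4 data bits."""
    syndrome = poly_mod(word7)
    if syndrome:
        word7 ^= 1 << SYNDROME_TO_BIT[syndrome]
    return word7 >> 3


if __name__ == "__main__":
    codeword = encode(0b1101)
    damaged = codeword ^ (1 << 5)  # flip one bit "in transit"
    print(bin(codeword), bin(damaged), bin(decode(damaged)))
```

The point is only that the check is a division with XOR standing in for subtraction, which is why it is cheap once it is done in hardware.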

> Now, without going into bus speeds and widths, each I/O Channel
> is a processor.

On SOME architectures. Are you including Block MUX in this?

> That means that each I/O channel can be doing
> asynchronous I/O independent of any other channel or CPU.
>
Yes, it generally does. PCs achieve the same effect with a chipset (see
below).

> Now with buffering techniques, 8 channels at 20MB/sec sustained
> rate(s), how much data can a modest mainframe (old, with
> original fibre optics) handle when the channels are at 50%
> capacity? Now look at a z/800 with 4 FICON channels (current
> fibre based channels) can do -- somewhere over 120MB/sec
> sustained.
>

120MB per second? That's your best shot...?

Supposing I said there are Intel I/O chips that can maintain around a GIG
per second? (More than 8 TIMES FASTER, in case you are too stunned to
calculate...)?
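
For the record, the arithmetic behind the figures quoted so far can be sketched as below. It is only a back-of-envelope calculation: the channel counts and rates are the ones quoted in this thread, 1 GB is taken as 1000 MB, and the helper name is made up for the example.

```python
# Back-of-envelope comparison using only the figures quoted in this thread.
# Assumption: 1 GB/s is treated as 1000 MB/s.

def aggregate_mb_per_sec(channels, rate_mb_per_sec, utilisation=1.0):
    """Total sustained rate across independent channels."""
    return channels * rate_mb_per_sec * utilisation


# "8 channels at 20MB/sec sustained ... at 50% capacity"
old_mainframe = aggregate_mb_per_sec(8, 20, 0.5)

# "z/800 with 4 FICON channels ... somewhere over 120MB/sec sustained"
z800_quoted = 120.0

# "around a GIG per second"
intel_quoted = 1000.0

ratio = intel_quoted / z800_quoted

if __name__ == "__main__":
    print(f"older mainframe: {old_mainframe:.0f} MB/s")
    print(f"z/800 (quoted):  {z800_quoted:.0f} MB/s")
    print(f"ratio:           {ratio:.1f}x")
```

Note that this compares a peak interface figure with a sustained channel figure, so treat the ratio as indicative rather than a benchmark.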

> I'm just warming up. I could get the actual data. How about you
> supplying the specs on the Intel based motherboard of your
> choice (I don't care how many CPUs it has).

OK. I choose the Intel IOP321...It isn't even a Motherboard; you could place
SEVERAL of these on a Motherboard...

The Intel IOP321 I/O processor is Intel's fifth generation I/O processor. It
is the first I/O processor to integrate an Intel XScale microarchitecture
core and a PCI-X interface. Many storage, networking, and embedded
applications require fast I/O throughput for optimal performance. The IOP321
is a highly integrated, cost-effective I/O system on a chip that delivers a
two-fold performance boost over its predecessor, the Intel IOP310 I/O
processor chipset, in I/O-intensive applications.

The IOP321 is especially well suited to networked storage applications
including RAID (Redundant Array of Independent Disks) adapter cards, ROMB
(RAID on motherboard), and other storage applications. Its small package
size, high data throughput, and integrated AAU/XOR provide an optimized
solution for these applications. In addition, the IOP321 is an ideal choice
for applications requiring a high performance I/O subsystem in a
tightly-integrated environment.

The IOP321 introduces a powerful combination of technical advancements. The
133 MHz PCI-X interface achieves up to 1 Gbyte per second throughput, a
two-fold increase over 66 MHz PCI. The internal bus operates at 200 MHz and
offers internal bandwidth of up to 1.6 Gbytes/second. The IOP321 also
features a 200 MHz DDR SDRAM controller with ECC that supports up to 1 Gbyte
of 64-bit DDR SDRAM, two times that of the previous generation. It contains
a 100 MHz, 32-bit local bus that is excellent for embedded applications
requiring a connection to non-PCI peripheral components such as ASICs, flash
memory, or DSPs.
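
The throughput numbers in that paragraph follow from clock rate times bus width. A rough sketch of the sums, assuming one transfer per clock and a 64-bit data path (the 64-bit PCI-X width is stated in the brief; the 64-bit internal bus width is my inference from the 1.6 Gbytes/second figure, not something the brief spells out):

```python
# Peak bandwidth = clock (MHz) x width (bytes), one transfer per clock.
# Assumption: all three buses are 64 bits wide; only the PCI-X width
# is stated in the brief, the rest is inferred from the quoted rates.

def peak_mb_per_sec(clock_mhz, width_bits):
    """Theoretical peak in MB/s, ignoring protocol overhead."""
    return clock_mhz * width_bits / 8


pcix_133 = peak_mb_per_sec(133, 64)      # "up to 1 Gbyte per second"
pci_66 = peak_mb_per_sec(66, 64)         # the older 66 MHz PCI
internal_bus = peak_mb_per_sec(200, 64)  # "up to 1.6 Gbytes/second"

if __name__ == "__main__":
    print(f"PCI-X 133 MHz: {pcix_133:.0f} MB/s")
    print(f"PCI 66 MHz:    {pci_66:.0f} MB/s")
    print(f"Internal bus:  {internal_bus:.0f} MB/s")
    print(f"PCI-X vs PCI:  {pcix_133 / pci_66:.2f}x")
```

These are theoretical peaks; sustained rates will be lower once arbitration and protocol overhead are counted.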

The IOP321 has additional features that accelerate I/O throughput. A
2-channel DMA controller facilitates increased PCI-to-memory throughput and
memory-to-memory throughput. The application accelerator unit contains a
hardware-based XOR capability and a 1 Kbyte queue to accelerate RAID-related
parity calculations. The application accelerator speeds transfer of read and
write data to the memory controller and computes data parity across local
memory blocks.
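
To show what the XOR unit is accelerating, here is a minimal software sketch of RAID-style parity. It assumes equal-sized blocks in one stripe; the function names are invented for the example and this is not the IOP321's programming interface.

```python
# RAID-style XOR parity done in software, for illustration only.
# Assumption: every block in the stripe has the same length.
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of all blocks; the result is the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """XOR of the survivors and the parity gives back the lost block."""
    return xor_parity(list(surviving_blocks) + [parity])


stripe = [b"MAIN", b"FRAM", b"E vs", b" PCs"]
parity = xor_parity(stripe)

lost = 2  # pretend the third disk has died
survivors = stripe[:lost] + stripe[lost + 1:]
recovered = rebuild_missing(survivors, parity)

if __name__ == "__main__":
    print(recovered == stripe[lost])
```

Doing this per byte on a general-purpose CPU is exactly the work a hardware XOR engine takes off the host.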

Product highlights
  a.. 32-bit high-performance CPU (400, 600 MHz) based upon Intel XScale
microarchitecture

  b.. Integrated 64-bit PCI-X interface (PCI-X 1.0a, PCI 2.2)

  c.. 200 MHz DDR SDRAM with ECC (1 GB of 64-bit memory, 32-bit mode
supported)

  d.. Intel® Superpipelined RISC technology (7-stage integer, 8-stage
memory)

  e.. 32 KB data cache, 32 KB instruction cache

  f.. 2 KB mini-data cache

  g.. ARM Version 5TE compliant

  h.. 32-bit local bus (100 MHz) / Flash I/F

  i.. 1.6 GB/s internal bus (200 MHz)

  j.. 2 DMA channels

  k.. 2 Serial (I2C) + SPI Port

  l.. Application Accelerator Unit with hardware-based XOR capability and 1
Kbyte queue

  m.. Watchdog timer, 2 programmable timers (Auto-reload, programmed
duration, selectable prescaling)

  n.. 1024-byte DMA and 4096-byte ATU buffers

  o.. 8 general-purpose I/O pins, 4 SDRAM output clocks, and integrated
timers

  p.. Performance Monitoring Unit

  q.. 544L PBGA (35mm)

(BTW, I don't intend to pursue this argument and I have no intention of
getting into a pissing contest on fastest I/O.

I have responded because you asked for me to support my argument and it is
obvious to me that your knowledge of technology advancement is sadly lacking
in some areas. (Areas where your bias and preconceived ideas apparently
forbid you to look...)).

I am not "anti-mainframe" (made a living off them for over 20 years) and I
really don't care whether people think they are better suited to networks,
and PCs are just toy boxes with no "real" computer power, or not.

My post (if you had read all of it) was fair and balanced. My main criticism
was levelled at CICS rather than mainframe hardware.

BOTTOM LINE: Mainframes are currently NOT a good architecture for
interactive networked applications. (Insofar as they suffer by comparison to
systems that ARE a good architecture for interactive networked
applications.)

Neither in Hardware terms, nor in Software design. End of Story.

Now, maybe we can share some pie?<G>

Pete.


