Re: Blue Chip Technology + MagnumX?



On Wednesday, in article <4kg96oFc3id2U1@xxxxxxxxxxxxxx>
dave@xxxxxxxxxxxxxxxxxxxx "David Hearn" wrote:
Paul Carpenter wrote:
On Tuesday, in article <4kdgojFbg7edU1@xxxxxxxxxxxxxx>
dave@xxxxxxxxxxxxxxxxxxxx "David Hearn" wrote:
John Devereux wrote:
David Hearn <dave@xxxxxxxxxxxxxxxxxxxx> writes:
John Devereux wrote:
David Hearn <dave@xxxxxxxxxxxxxxxxxxxx> writes:
Has anyone got any experience of Blue Chip Technology
(http://www.bluechiptechnology.co.uk/) and especially their MagnumX
single board computers?
....
Our app is quite simple - loop until a counter expires, each iteration
store 1 byte pin state register and 32 bit counter value. We
currently can sample up to 60MB of data.
Can you not use DMA instead? And/Or perhaps an external FIFO.
Personally, I have no idea. I've never used DMA before (I'm
traditionally a desktop app guy!) - so I'm not sure where I would
start, or what I'd need to do. Essentially, we just want to move a 8
bit GPIO register containing pin status and a 32 bit register
containing a counter value into RAM as fast as possible - at around
3.3MHz (sample every 300ns).
Paul Carpenter beat me to most of the points I would make. I would
just wonder whether you actually need the counter - if the samples are
being acquired reliably then perhaps the sample number can give you the
time?
Yes, if the samples are being taken at guaranteed periodic intervals
then use of the counter is not needed.

The software counter only *guarantees* that a counter variable counts;
it cannot under all circumstances guarantee anything else.

The counter used decrements once per (133MHz) clock tick. Difference
between counter values can then be used to determine period between samples.

If you read a counter (assuming the read is an atomic operation, with no
change in the counter whilst reading bytes/words etc.), then depending on how
synchronised the software is to the [assumed] hardware counter you are at
best +/- 1 count from the time the counter 'expired'. So sampling has jitter
on it.

If the counter 'expires' by counting down and stopping then any number of
effective counts could have occurred without decrementing the counter, so
sampling has even worse jitter on it.
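
As a rough illustration of turning counter differences into sample periods,
here is a minimal sketch. It assumes a free-running 32 bit counter that
decrements at 133MHz, as described above; the function names are made up for
the example and are not from any SDK. At 133MHz one count is about 7.5ns,
so +/- 1 count of read jitter is small next to a 300ns sample period, but
any stall between the sample and the counter read adds directly to it.

```c
#include <stdint.h>

/* Assumed from the thread: counter decrements once per 133MHz clock tick. */
#define COUNTER_HZ 133000000ULL

/* Ticks elapsed between two reads of a DOWN counter. Unsigned 32 bit
 * subtraction gives the right answer across a single wrap past zero. */
static uint32_t ticks_between(uint32_t earlier, uint32_t later)
{
    return earlier - later;
}

/* Convert a tick count to nanoseconds, rounded to the nearest ns. */
static uint64_t ticks_to_ns(uint32_t ticks)
{
    return ((uint64_t)ticks * 1000000000ULL + COUNTER_HZ / 2) / COUNTER_HZ;
}
```

A 300ns sample period should therefore show up as a difference of about 40
counts between consecutive stored counter values.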

....
What is the maximum frequency of change of ANY digital input compared
to desired sampling frequency?

There are 5 digital inputs. Currently 4 of them are sampled perfectly
fine and provide the data we're after. The issue is with 1 input which
can run around 850kHz if data is present. It's not a stable line (ie.
not a 850kHz clock) but has frequent phase changes - but the max is
around 850kHz.

To what accuracy of timing do you have to measure these changes?
i.e. 10ms, 1ms, 100us, 10us, 1us, 100ns? These affect how good the sampling
method is.

At present, the 2.xMHz we're sampling at isn't quite fast enough to get
all the transitions. On the other 4 lines the speed is fine.

That can be down to architecture and how the port pins are read as well
as frequency of input changes on all the lines.

To try and speed things up (ie. reduce memory accesses and
conditionals), we started just sampling the pin state, but not filtering
out any non-changes - therefore making the code path in each iteration
of the loop identical. This, in theory, should give the same sample
rate each time, once the 'fudge' factor (time taken for each loop
iteration) is calculated.
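
The approach described above can be sketched roughly as follows: a capture
loop with an identical code path on every iteration, and a separate pass
run afterwards that picks out the transitions. This is only an illustration;
the `port` pointer stands in for whatever GPIO register address the real
board uses, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Capture pass: one volatile read and one store per iteration, with no
 * data-dependent branches, so each iteration takes the same code path. */
static void capture(volatile const uint8_t *port, uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = *port;
}

/* Post pass, run once capture has finished: store the sample indices at
 * which the pin state differs from the previous sample. Returns the
 * number of transitions stored (never more than max). */
static size_t find_transitions(const uint8_t *buf, size_t n,
                               size_t *idx, size_t max)
{
    size_t found = 0;
    for (size_t i = 1; i < n && found < max; i++)
        if (buf[i] != buf[i - 1])
            idx[found++] = i;
    return found;
}
```

The sample index only converts to a time if the loop period really is
constant, which is the assumption being questioned in this thread.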

Only if you can guarantee that NO other actions could EVER occur on your
board, causing the micro or its support hardware to put in waits
and change your timing.

So far we've not seen any signs that this is happening, or at least, if
it's happening, it's not affecting the quality of our (4 line) samples.

You have a single board system with 64MB of DRAM. Dynamic RAM has to be
refreshed, and depending on the type this is done automatically by hardware
in the system. This refresh regime may, depending on the processor, delay all
software, delay dynamic memory accesses or extend timing at times you are
not expecting.

For example it could delay all software just as your timer 'expires', so
many effective counts occur before the software resumes, putting jitter on
the sampling point. Alternatively it may well delay a write to DRAM, which
extends the loop time.

Also I find it difficult to believe that on a board with ethernet, USB
and serial that you are not using your application under an operating system
which will have its own events happening. If not now it will do when you
want to do something with the acquired data.

As the processor is probably a 32bit processor I would consider taking 4
samples and creating a single 32bit value in data cache to be written to
external memory as one 32bit operation. Instructions and data operating
in cache will be less susceptible to outside influences and writing to memory
will have fewer issues.
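
A minimal sketch of that packing idea, assuming an 8 bit port register and
a 32 bit store; the `port` pointer and the function names are placeholders,
not anything from the board's SDK. The first sample goes in the low byte,
so the unpacking helper does not depend on the processor's byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the port four times, build one 32 bit word in a register, then do
 * a single 32 bit store instead of four byte stores. */
static void capture_packed(volatile const uint8_t *port,
                           uint32_t *out, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        uint32_t w = *port;            /* sample 0 -> bits 0..7   */
        w |= (uint32_t)*port << 8;     /* sample 1 -> bits 8..15  */
        w |= (uint32_t)*port << 16;    /* sample 2 -> bits 16..23 */
        w |= (uint32_t)*port << 24;    /* sample 3 -> bits 24..31 */
        out[i] = w;
    }
}

/* Recover sample number k from the packed buffer. */
static uint8_t sample_at(const uint32_t *words, size_t k)
{
    return (uint8_t)(words[k / 4] >> (8 * (k % 4)));
}
```

This cuts the number of writes to external memory by four, but the four
reads inside one word are spaced differently from the gap between words,
so it does nothing for sample-to-sample regularity.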

I believe this worked well - however the sample rate was still only
fractionally faster than 2MHz - not close to the (approx) 3.3MHz we'd like.

That is the average sample rate over a period of time; we have no idea what
accuracy of capture timing you are trying to achieve.

At the rates of capture you are trying to achieve I would NEVER do it by
software polling.

Using the same code in an interrupt handler (store byte in array,
increment index) was taking longer than polling.

Because of interrupt latencies and overheads in processing interrupts.
Interrupts are not meant for this type of operating frequency.

........

We originally tried using a timer to generate interrupts and to sample
at a constant rate, however, we found that the sampling speed was far
slower than using a simple while() loop - too slow for our needs. To

Probably too much code in the interrupt routine and interrupt latencies.
The timer should be DIRECTLY strobing the data capture.

See above about code in interrupt handler and performance issues.

Interrupt latencies and overheads are probably due to stack saving and
restoring of ALL registers, amongst other issues, which could be the OS as well.

further complicate matters, turning full optimisation on (using gcc for
M68k) caused interrupts to break. Using full optimisation made a
noticeable difference in speed when doing the simple while() loop.

Sounds more likely to be improper use of things like volatile and other
software design problems.

If -O3 is enabled, then when entering an interrupt handler and then
clearing/acknowledging the interrupt, when exiting the same interrupt
got fired again - implying that the clearing of the interrupt never
occurred. Turning it down to -O1 (and no other changes) again made it
work. Register .h file definitions are from Freescale/LogicPD provided
SDK/header files.

Which suggests the optimiser is removing a bit clear and bit set of the same
bit, which could still mean that some register, or a variable used to hold
a copy of the register, was not declared as volatile.
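
To illustrate the kind of thing being suggested, here is a hedged sketch of
acknowledging an interrupt through a volatile-qualified pointer. The flag
name and bit position are invented for the example and do not come from the
Freescale/LogicPD headers; the real register may also need a different
clearing sequence (write-one-to-clear, read-then-write and so on).

```c
#include <stdint.h>

/* HYPOTHETICAL bit position of a timer interrupt pending flag. */
#define TIMER_IRQ_PENDING 0x01u

/* Clear the pending flag with a read-modify-write. Because the pointer
 * is to a volatile object, the compiler must perform the read and the
 * write exactly as written at any optimisation level. Without volatile
 * it is free to merge, reorder or drop accesses it considers redundant,
 * which is more likely to show up at -O3 than at -O1. */
static void timer_irq_ack(volatile uint8_t *status)
{
    *status &= (uint8_t)~TIMER_IRQ_PENDING;
}
```

The same applies to any shadow copy of the register that is shared between
the handler and the main loop.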

Unfortunately I have little control over modifications of the hardware
(ie. adding FIFOs etc). Whilst I can advise what might be a better off
the shelf choice, custom changes to boards is unlikely to happen at this
stage (proof of concept and initial development).

Well if you don't sort out the data acquisition properly BEFORE deciding
on processor, it is akin to saying

"I have this wonderful MP3 player which needs an Edison Wax Cylinder
interface via USB without designing the hardware to interface it."

Proofs of concept have always, in my opinion, had to show what is and
what is NOT needed. If you approach the problem with the wrong tool
then you will need to add something to kludge the lot together.

All you are proving at the moment is that we can throw money at buying
bits, any bits, and that if we keep buying bits we will force the problem
to be solved.

There are two options we are considering to solve the problem.

1.) Sample faster (from 2.xMHz to 3.3MHz)
2.) Get the designer of the board generating the digital inputs to
detect the phase changes and have that as the 5th digital input, rather
than the 850kHz data we're getting. This would mean our current
implementation will be suitable.

Depending on what accuracy you want to sample the changes to.

........

I appreciate all your help - but really, at the moment, we're limited to
using an off-the-shelf processor board - it's just knowing which is
suitable. We're 2/3 of the speed we need at present - so not a huge
amount to make up - it feels as if we should be able to do it using our
current 'design' - just wondering whether other hardware would have the
speed increase we're looking for.

Other hardware added to your board, or even much lesser processor cards,
will easily do this FUNCTION. Other hardware will do something your
hardware will not do: guarantee the sample to sample timing. Ensuring
that it stays that way is then a matter of moving the data into memory
fast enough to keep up with the sampling regime.

I don't disagree with this statement.

You appear not to have done any capture timing tests to see how regular
your digital capture is from sample to sample and whether it meets any
part of your spec.

We have done this, and confirmed that 4 of the 5 digital inputs are
being sampled with sufficient quality to consistently provide the data
we're looking for.

All you have shown me at the moment is that you have a bunch of data
acquired at AVERAGE clock rates.

Which is fine for our purpose (except the average rate is too slow for
one of the inputs, which we may be able to solve with a change elsewhere).
......
Incidentally, I got a reply from Blue Chip Technology regarding my
question to them. They were very helpful and said:

"Unfortunately, the MagnumX will not do what you are looking for.
On the board the GPIO lines run on the I2C bus, and the chipset
manufacturers recommendation is that operation of these lines should be
around 5Hz max, well below what you are looking for.

Looking at your requirements, our design engineers have suggested that
you look at some form of Bus Mastering Data Acquisition card to get the
speed you are looking for. Even then, with the 33Mhz PCI bus, and a PCI
Read taking about 10 clock cycles per register, you are right on the
limits of what the PCI bus can cope with.

I'm sorry but we cannot even think of any Data acquisition card that may
be suitable"

I suspect a typo in the 5Hz max - but I believe the meaning is still the
same.

I suspect they mean 5kHz. But it could be 5Hz depending on the system
and software being used to drive it under what they consider normal
conditions.

Looks like a change to the input board may be the best option (if it's
possible) as then we'd have met our requirements and got the necessary data
sampled to the resolution and quality we require.

Consider getting something that can pack the data into 16/32bits, then
you may be able to get the speed up more easily, especially if it is a bus
mastering device.
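
For a feel of why packing matters, here is a back-of-envelope sketch using
only the figures quoted from Blue Chip above (33MHz PCI, about 10 clocks per
register read). The function names are made up for the example, and real
PCI timing depends on the bridge and the target, so treat the numbers as
rough upper bounds only.

```c
#include <assert.h>
#include <stdint.h>

/* Samples per second when each sample costs one or more separate reads. */
static uint32_t rate_unpacked(uint32_t bus_hz, uint32_t clocks_per_read,
                              uint32_t reads_per_sample)
{
    assert(clocks_per_read != 0 && reads_per_sample != 0);
    return bus_hz / (clocks_per_read * reads_per_sample);
}

/* Samples per second when one read returns several packed samples. */
static uint32_t rate_packed(uint32_t bus_hz, uint32_t clocks_per_read,
                            uint32_t samples_per_read)
{
    assert(clocks_per_read != 0);
    return (bus_hz / clocks_per_read) * samples_per_read;
}
```

With those figures, one read per sample gives about 3.3M samples/s, right
at the target rate; reading the pin register and the counter separately
halves that to 1.65M; four 8 bit samples packed into each 32 bit read
would give about 13.2M.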

--
Paul Carpenter | paul@xxxxxxxxxxxxxxxxxxxxxxxxxxx
<http://www.pcserviceselectronics.co.uk/> PC Services
<http://www.gnuh8.org.uk/> GNU H8 & mailing list info
<http://www.badweb.org.uk/> For those web sites you hate
