Re: Windows Assembly



Richard Cooper wrote:

> I don't see how it could get worse than twice as many. Instead of
> just calling the kernel, and making one context switch, you call the
> kernel (for some IPC thing) and it sends it to the other process,
> which is two context switches. Just double the size of your buffers
> and you're back to the same number of context switches.

Assuming you send stuff that can be batched.

> Personally, if I wrote an OS I'd do everything in userspace, but so
> long as Linux is the way it is, then the video should be in the
> kernel just like everything else is.

Hm, video drivers in userspace too? Either you'd need a context switch every
time you need to do port I/O, or you'd need to allow usermode apps to do
port I/O. Hm. Of course it can be allowed per-process if you set up a TSS
and an I/O permission bitmap for each process, but... humm.
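
To make the per-process idea concrete: the TSS points at a bitmap with one bit per port, where a clear bit means the port may be touched from user level and a set bit means the access faults (the CPU only consults the bitmap when CPL > IOPL). Below is a rough C model of just that lookup, not real TSS setup code. The names `io_bitmap_t`, `iopb_deny_all`, `iopb_allow` and `iopb_permits` are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define IO_PORTS 65536u

/* One bit per port, modelled on the x86 TSS I/O permission bitmap:
   bit clear = access allowed, bit set = access faults. */
typedef struct {
    uint8_t bits[IO_PORTS / 8];
} io_bitmap_t;

/* Start with every port denied. */
static void iopb_deny_all(io_bitmap_t *bm)
{
    memset(bm->bits, 0xFF, sizeof bm->bits);
}

/* Grant access to `count` consecutive ports starting at `first`. */
static void iopb_allow(io_bitmap_t *bm, uint32_t first, uint32_t count)
{
    for (uint32_t p = first; p < first + count && p < IO_PORTS; p++)
        bm->bits[p / 8] &= (uint8_t)~(1u << (p % 8));
}

/* An access of `width` bytes (1, 2 or 4) is permitted only if the bit
   for every port it touches is clear. */
static bool iopb_permits(const io_bitmap_t *bm, uint32_t port, uint32_t width)
{
    for (uint32_t p = port; p < port + width; p++) {
        if (p >= IO_PORTS || (bm->bits[p / 8] & (1u << (p % 8))))
            return false;
    }
    return true;
}
```

With something like this, a usermode video driver would get only the VGA register range (say `iopb_allow(&bm, 0x3C0, 32)`) opened in its own bitmap, and every other process would still fault on those ports.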

>> Well, the VT-100 emulators would use the console subsystem which in
>> turn would communicate with the video driver - but yes.
>
> I just don't see why the VT-100 emulator needs to be in the kernel.

And indeed it shouldn't. Just because we have a core OS subsystem doesn't
mean it necessarily has to be implemented in kernel-mode.

> Of course, half of those problems would disappear if the kernel didn't
> send delete when you press backspace.

Doh :)

>> implemented via hooks and whatnot to make it generic enough for
>> not just X to use it.
>
> In the kernel source there's a comment that says something like "I'm
> not sure what this is supposed to do, but X seems to want it to do
> this, so that's what I made it do." The kernel shouldn't be designed
> for X.

Couldn't agree more. Generic hooks are okay, but their intention should be
to support multiple clients - not to serve as hacky kludges that make one
system work.

> For example, if you're writing a 256 color application, just make a
> linear buffer that's one byte per pixel and make everything draw into
> that, then when you're done, have a function that converts that to
> whatever the video card actually uses, whether it be 8/15/16/24/32
> bit color, or even something as messed up as VGA's four plane mode
> that you have to use with the 320x240x256 mode. I think that's
> easier than writing one line drawing algorithm or texture mapping
> function that works on 8 bit modes, another that works on 16 bit
> modes, another that works on 24 bit modes, and another that works on
> that four plane memory configuration.

Well, yes. I'm probably a bad person for assuming that the hardware my stuff
runs on will support 32bpp modes :). When I need to support "clunkier"
hardware, I usually do it your way too.
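
A minimal sketch of that "draw in 8-bit, convert at the end" approach in C, covering only two linear targets (16bpp 5:6:5 and 32bpp xRGB). The planar VGA case is left out, and the names `rgb_t`, `blit_8_to_16` and `blit_8_to_32` are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* One palette entry for the 8-bit software buffer. */
typedef struct { uint8_t r, g, b; } rgb_t;

/* Pack a palette entry into 16-bit 5:6:5. */
static uint16_t pack_rgb565(rgb_t c)
{
    return (uint16_t)(((c.r >> 3) << 11) | ((c.g >> 2) << 5) | (c.b >> 3));
}

/* Pack a palette entry into 32-bit 0x00RRGGBB. */
static uint32_t pack_xrgb8888(rgb_t c)
{
    return ((uint32_t)c.r << 16) | ((uint32_t)c.g << 8) | c.b;
}

/* Convert n pixels from the 8-bit indexed buffer to a 16bpp target.
   The palette is packed once into a lookup table, so the per-pixel
   work is a single table read. */
static void blit_8_to_16(uint16_t *dst, const uint8_t *src, size_t n,
                         const rgb_t pal[256])
{
    uint16_t lut[256];
    for (int i = 0; i < 256; i++)
        lut[i] = pack_rgb565(pal[i]);
    for (size_t i = 0; i < n; i++)
        dst[i] = lut[src[i]];
}

/* Same idea for a 32bpp target. */
static void blit_8_to_32(uint32_t *dst, const uint8_t *src, size_t n,
                         const rgb_t pal[256])
{
    uint32_t lut[256];
    for (int i = 0; i < 256; i++)
        lut[i] = pack_xrgb8888(pal[i]);
    for (size_t i = 0; i < n; i++)
        dst[i] = lut[src[i]];
}
```

All the drawing code only ever sees the `uint8_t` buffer; supporting another pixel format means adding one more small conversion routine rather than another line drawer.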

> Of course, you could just require a 24-bit RGB linear framebuffer and
> bomb out if the video card says it can only do 24-bit BGR linear or
> 24-bit RGB non-linear, but that's not very user friendly.

:)

> It may even be beneficial to use the rep movsd instruction for the
> ram to video ram copy, since I know that the processor does some
> optimizations for those moves, like since it knows that it's doing a
> large block, it reads and writes 16 bytes at a time or something like
> that. I haven't tried this myself though, since Softer has to
> convert to funny VGA formats along the way and so it just reads and
> writes dwords at a time.

Video memory *is* bad and slow - and readback is even slower than writing
(the PCI Express architecture promises to fix this, but it probably won't
happen in the first couple of generations of video cards). When blitting a
sysmem buffer to vidmem, you'll probably want to use the uncached
(non-temporal) store instructions added with... was it SSE or SSE2? (if
available, of course).
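
For what it's worth, the non-temporal stores arrived in two steps: SSE brought MOVNTQ and MOVNTPS, and SSE2 added MOVNTDQ and MOVNTI. Here is a hedged sketch of a blit using the SSE2 intrinsic `_mm_stream_si128`, falling back to plain `memcpy` when SSE2 isn't enabled at compile time. `stream_copy` is a made-up name, the destination here is ordinary memory standing in for a mapped framebuffer, and whether this actually beats `rep movsd` on a given card is something to measure, not assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy `bytes` bytes from src to dst using non-temporal stores when
   SSE2 is available. Assumptions of this sketch: dst is 16-byte
   aligned and bytes is a multiple of 16. */
static void stream_copy(void *dst, const void *src, size_t bytes)
{
    assert(((uintptr_t)dst & 15) == 0);
    assert((bytes & 15) == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (size_t i = 0; i < bytes / 16; i++) {
        __m128i v = _mm_loadu_si128(s + i);  /* source may be unaligned */
        _mm_stream_si128(d + i, v);          /* MOVNTDQ: store bypassing the cache */
    }
    _mm_sfence();  /* order the streamed stores before later stores */
#else
    memcpy(dst, src, bytes);  /* portable fallback, no cache hint */
#endif
}
```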

> Now if you're not updating the entire screen each frame, then copying
> the rest of that buffer is a waste of time. Softer overcomes this by
> keeping another buffer of what's in video ram, and comparing each
> dword that it's about to write to video ram to what is in that
> buffer, and if it's the same, then it doesn't write that dword to
> video ram.

It might be possible to speed this up by adding some "line-changed" or even
"tile-changed" logic? Softer is a terminal emulator, right? So comparing
character tiles instead of pixel values could perhaps speed things up?
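
Roughly what I have in mind, as a C sketch. It assumes the emulator keeps a grid of character cells plus a shadow copy of what was last drawn; I don't know how Softer is actually structured, so `cell_t` and `collect_dirty_cells` are purely illustrative names.

```c
#include <stddef.h>
#include <stdint.h>

/* One character cell: glyph code plus attribute byte (colors etc.). */
typedef struct { uint8_t ch; uint8_t attr; } cell_t;

/* Compare the current screen against the shadow copy of what is already
   on the display. Indices of cells that differ are written to `dirty`
   (which must have room for ncells entries) and the shadow is updated.
   Returns the number of cells that need repainting. */
static size_t collect_dirty_cells(const cell_t *screen, cell_t *shadow,
                                  size_t ncells, size_t *dirty)
{
    size_t n = 0;
    for (size_t i = 0; i < ncells; i++) {
        if (screen[i].ch != shadow[i].ch || screen[i].attr != shadow[i].attr) {
            shadow[i] = screen[i];
            dirty[n++] = i;
        }
    }
    return n;
}
```

The caller then repaints only the glyphs listed in `dirty`. Assuming an 80x25 screen with 8x16 glyphs at one byte per pixel, that is 2000 cell compares per frame instead of 64000 dword compares.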

> There is the vm86 system call. The problem is that the man page for
> it only mentions that it exists and that it takes a couple of
> structures as parameters; it doesn't detail how to use it. I looked
> at the structures; as usual, the individual fields in the structures
> are named with mystery TLAs and there's no indication anywhere in the
> header file what sort of things I might need to put into those fields.
>
> In all likelihood the call was only added to the kernel because
> someone working on dosemu needed it, and so no one ever bothered to
> document it since the dosemu guys already knew how to use it since
> they designed it.

:-)

