Some Other Nice Things
I’m describing only one technique here. If you don’t like it, try reading the
Midas documents. The idea of frameskip is to make a program run at the same
speed on all machines. The main idea is very straightforward : when the objects
have been updated once, we’ve drawn one frame. A 386 spends a lot more time
drawing a frame than a Pentium. This means that a 386 should skip some frames
and thus draw fewer frames per time unit. An example : suppose the rotation
angle of an object should increase by 9 degrees per second. We should then do
as follows :
Frame   386 angle   Pentium angle
  1         0             0
  2         3             1
  3         6             2
  4         9             3
  5         -             4
  6         -             5
  7         -             6
  8         -             7
  9         -             8
 10         -             9
Why ? A Pentium is fast enough to draw 10 fps where a 386 can only manage
four. Now your object rotates at the same speed on all computers, just more
smoothly on a Pentium than on a 386.
fstart=read_time
loop
    < calculate >
    < draw >
    < flip >
    < do whatever you want >
    fend=read_time
    angle=angle+speed*(fend-fstart)
    fstart=fend
until 1=2
All variables are of course real or fixed-point numbers. And read_time should
be accurate; the basic 18.2 Hz clock won’t do.
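For reference, here is a minimal sketch of the same loop in C, assuming a POSIX
clock_gettime() timer stands in for read_time (any timer with finer resolution
than the 18.2 Hz BIOS tick will do); the frame body is only a placeholder.

#include <stdio.h>
#include <time.h>

static double read_time(void)              /* current time in seconds */
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void)
{
    const double speed = 9.0;              /* degrees per second */
    double angle  = 0.0;
    double fstart = read_time();

    for (int frame = 0; frame < 100; frame++) {
        /* < calculate > < draw > < flip > < do whatever you want > */
        double fend = read_time();
        angle += speed * (fend - fstart);  /* scale by the real frame duration */
        fstart = fend;
        printf("frame %3d  angle %8.3f\n", frame, angle);
    }
    return 0;
}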
Author : Henri 'RoDeX' Tuhkanen
Email : [email protected]
Groups : CyberVision, Embrace, the Damned, Hard Spoiled, tAAt,
Regeneration, Magic Visions and a couple of others I can’t remember
Achievements : 7th prize at asm’96 4k intro compo
A brief portrait : I’m a 19-year-old 3D/gfx coder and I code almost all the time.
I can give away a lot of coding-related material, but mainly algos; I don’t like
rippers. I’m not afraid of being wrong and I want to learn everything about
coding and computers.
With the Pentium, the concepts of instruction pairing and a fast math
coprocessor became known in the PC world. Now I’ll explain how pairing works,
and how you can take advantage of it in your own programs.
There are actually two parallel execution pipes inside a Pentium processor.
Only one of them is complete; it is called the U pipe, and the other (less
complete) one is the V pipe. The V pipe can only execute jumps and some basic
instructions like mov, add, and lea. Pairing means these two pipes working
simultaneously on the same clock, and it only happens under special
circumstances.
In the instructions above we’ve assumed that both instructions pair in either
pipe. Even though this is the common case, things are not always like that.
Example :
mov ebx,edx ; 1
shl eax,1 ; 1 (doesn't pair in V)
; total 2
But :
shl eax,1 ; 1
mov ebx,edx ; 0
; total 1
(shl pairs only in U pipe)
cmp eax,ebx ; 1 U
jne (somewhere) ; 0 (pairs only in V)
The flags are updated so quickly that they’re available at the same clock, so
the above example works. There are many instructions which pair only in the U
or V pipe, or not at all. SHL and ROL pair only in the U pipe, jumps pair only
in the V pipe, while CLI and MUL don’t pair at all (they reserve both pipes).
When you start finding out which instructions pair with each other, you notice
that you begin forming instruction pairs and moving instructions to places
where they pair. With many instructions, though, whether they pair depends on
the situation. The pairing rules can be found at least in the Pentium update
of HelpPC.
Now the theory should be quite clear, so it’s time to try optimizing a real
piece of code. In the following example we’ll optimize a slow line-drawing
inner loop.
UV : pairs in both pipes
NU : pairs only in U
NV : pairs only in V
NP : doesn’t pair at all
? : The speed of memory references depends on many things. We
suppose that the situation is ideal.
Code :
@@inner:
Normally we use pmode in flat mode, which means that we must add the starting
address of the screen memory to edi. The loop jump is best made short, because
the distance to @@inner is less than 128 bytes; this saves two bytes in the
assembled instruction. Now we can get rid of 'add edi,ebx' by changing the
store to ‘mov [edi+ebx+0a0000h],b 10d’. We also arrange the instructions so
that they pair whenever possible.
@@inner:
; 4?
The difference in speed is remarkable. The downside is that the code gets
messier, but that’s a cheap price to pay for speed. Well, the clock counts on
the lines where a [variable] is added to a register are very questionable and
probably something quite different from zero. In any case, the loop is now
pairing efficiently. I just find it pointless to multiply y by 320 every time.
So :
eax=start_x*256
edi=start_y*320
[xp]=x_coeff*256
[yp]=y_coeff*320
edi=0a0000h
ebx=0
@@inner:
dec cx ; 1 U UV cx-=1
; 3?
Fast ? This is partly an illusion, because there are many other things which
affect speed; a few examples are cache misses and interrupts, which stop
pairing and often cause a cache miss as well. A cache miss is a situation in
which the needed data is not found in the processor’s internal memory (level 1
cache) and must be brought in from the external cache (level 2 cache). This
usually means a couple of extra clock cycles and a stop in pairing. If the data
is not found in the level 2 cache either, it must be brought from actual
memory, resulting in a loss of over 10 clocks. These fetch times depend
directly on the memory the computer is using, and in certain cases the
differences can be big. For example, a fetch could take 5 clocks with 60 ns
multi-access EDO and 15 clocks with normal memory. With the level 2 cache, the
difference between pipeline burst and some other type can be two clocks. So at
this point we notice that not everything depends on the code.
Luckily we have one trick left : we can’t affect the caches directly, but we
can speed up memory handling by arranging the data. For example, variables that
are used in the same loop should be arranged so that they lie consecutively in
memory, in blocks of 32 bytes. This is because the Pentium moves data between
the cache and normal memory in 32-byte blocks, and such a block is never split.
So it’s worth trying to align the code and the data to 32 bytes. Especially
loops, and the variables used in them, should fit in as few blocks as possible.
It’s always worth using your imagination when coding critical loops to get the
best possible use out of the level 1 cache.
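As a rough modern illustration in C (not something a 1996 DOS compiler would
take), the hot variables of a loop can be packed into one 32-byte, 32-byte
aligned block with the C11 alignas keyword; the struct name and fields below
are made up for this sketch.

#include <stdio.h>
#include <stdint.h>
#include <stdalign.h>

/* All variables touched by the inner loop packed into one 32-byte block,
   so they land in a single cache line instead of several. */
struct line_vars
{
    int32_t xp, yp;        /* fixed-point x/y increments           */
    int32_t x, y;          /* current fixed-point position         */
    int32_t count;         /* pixels left to draw                  */
    int32_t pad[3];        /* pad the block up to exactly 32 bytes */
};

static alignas(32) struct line_vars lv;

int main(void)
{
    printf("size %zu bytes, start address mod 32 = %u\n",
           sizeof lv, (unsigned)((uintptr_t)&lv % 32));
    return 0;
}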
We perform the following process : every color sphere pulls the closest
palette sphere towards itself. The bigger a color sphere is, the bigger its
pulling force. Now nearly all of the palette spheres are being pulled by one
or more color spheres. (The new coordinates of a palette sphere are calculated
as the average of the color values of the color spheres pulling it, in other
words the sum of the colors divided by the number of color spheres.) The
palette spheres which are not pulled by any color sphere teleport near some
color sphere. The palette spheres keep moving around the space like this until
their movement slows below a defined threshold (trust me, it really does slow
down). Now the new palette can be read from the coordinates of the palette
spheres.
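A rough C sketch of one iteration of this pulling process might look like the
following; the struct names, the 256-entry limit and the count-weighted
averaging are my own reading of the description, not the author’s actual code.

#include <math.h>
#include <stdlib.h>

typedef struct { unsigned char R, G, B; unsigned long count; } color;  /* a color sphere   */
typedef struct { float R, G, B; } palSphere;                           /* a palette sphere */

/* One iteration : every color pulls its nearest palette sphere, each pulled
   sphere moves to the count-weighted average of the colors pulling it, and
   spheres nobody pulled teleport next to a random color.  Returns the largest
   movement of any palette sphere (sum of absolute component changes).
   Assumes nPal <= 256. */
static float pull_iteration(const color *col, int nCol, palSphere *pal, int nPal)
{
    double sumR[256] = {0}, sumG[256] = {0}, sumB[256] = {0}, weight[256] = {0};
    float maxMove = 0.0f;

    for (int i = 0; i < nCol; i++) {
        int best = 0;
        float bestDist = 1e30f;
        for (int j = 0; j < nPal; j++) {          /* find the closest palette sphere */
            float dr = col[i].R - pal[j].R;
            float dg = col[i].G - pal[j].G;
            float db = col[i].B - pal[j].B;
            float d  = dr * dr + dg * dg + db * db;
            if (d < bestDist) { bestDist = d; best = j; }
        }
        sumR[best]   += (double)col[i].R * col[i].count;   /* bigger sphere, bigger pull */
        sumG[best]   += (double)col[i].G * col[i].count;
        sumB[best]   += (double)col[i].B * col[i].count;
        weight[best] += col[i].count;
    }

    for (int j = 0; j < nPal; j++) {
        float nR, nG, nB;
        if (weight[j] > 0.0) {                    /* average of the colors pulling it */
            nR = (float)(sumR[j] / weight[j]);
            nG = (float)(sumG[j] / weight[j]);
            nB = (float)(sumB[j] / weight[j]);
        } else {                                  /* pulled by nobody : teleport near a color */
            const color *c = &col[rand() % nCol];
            nR = (float)c->R; nG = (float)c->G; nB = (float)c->B + 1.0f;
        }
        float move = fabsf(nR - pal[j].R) + fabsf(nG - pal[j].G) + fabsf(nB - pal[j].B);
        if (move > maxMove) maxMove = move;
        pal[j].R = nR; pal[j].G = nG; pal[j].B = nB;
    }
    return maxMove;
}

/* Repeat until the movement slows below some threshold, e.g. :
   while (pull_iteration(colors, nColors, palette, 256) > 0.5f) ;  */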
Now we create another table which contains a list of the colors the picture
originally has. So we go through the histogram, and at every point where there
is some color (the value being greater than zero), we put the color value and
its amount into this new table. The table can, for example, look like this :
typedef struct
{
    unsigned char R,G,B;   // color values
    unsigned long count;   // number of pixels with this color in the pic
} colorListStruct;

colorListStruct colorList[32768];
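Filling this table from a 32x32x32 histogram could, as a rough sketch, look
like the following; the histogram array and its 5-bits-per-component packing
are assumptions made for illustration, and colorList is the table declared
above.

static unsigned long histogram[32768];   /* pixel counts, filled while scanning the image */
static int numColors;                    /* number of entries used in colorList           */

static void buildColorList(void)
{
    numColors = 0;
    for (int i = 0; i < 32768; i++) {
        if (histogram[i] > 0) {                            /* this color occurs in the pic */
            colorList[numColors].R     = (i >> 10) & 31;   /* 5 bits per component */
            colorList[numColors].G     = (i >>  5) & 31;
            colorList[numColors].B     =  i        & 31;
            colorList[numColors].count = histogram[i];
            numColors++;
        }
    }
}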
Since the palette we’re going to use is not going to include every color in
the colorspace, we only use a "subspace".
Now, by averaging each subspace, we get two colors that we know are near the
right ones. At this point the target image will look pretty bad, but after,
say, 16 splits, you can already tell what the image will really look like, and
at 256 levels you can hardly see the difference (well ok, that depends heavily
on the image).
The example image may give you a slightly wrong idea of the algorithm. There
will be a gap between the new subspaces, and they will most probably shrink in
the other dimensions as well.
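A rough sketch of the averaging step, working on a slice colorList[first..last]
of the table above and weighting each color by its pixel count (an assumption;
a plain average of the distinct colors would also match the text) :

/* Average one subspace, colorList[first..last], into a single palette color. */
static void averageSubspace(int first, int last,
                            unsigned char *outR, unsigned char *outG,
                            unsigned char *outB)
{
    unsigned long long r = 0, g = 0, b = 0, n = 0;

    for (int i = first; i <= last; i++) {
        r += (unsigned long long)colorList[i].R * colorList[i].count;
        g += (unsigned long long)colorList[i].G * colorList[i].count;
        b += (unsigned long long)colorList[i].B * colorList[i].count;
        n += colorList[i].count;
    }
    if (n == 0) n = 1;                   /* empty subspace : avoid dividing by zero */
    *outR = (unsigned char)(r / n);
    *outG = (unsigned char)(g / n);
    *outB = (unsigned char)(b / n);
}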
1. Analyze subspaces
Find out the information about the subspace that we need. In
practice this means checking which color component (R, G or B) has
the largest range (max-min) and what the size of the subspace is.
This size may mean the number of colors in the subspace, the sum of
the values of the biggest component, or just the largest component
range. You’ll need the sum of the values of the biggest component
later in any case.
2. Select largest subspace
This is simple; just check the values from each subspace and find
out which is the largest.
3. Sort it by the largest component
This is by far the most power-hungry part of the algorithm. You’ll
need to sort the colors by the component you found in the analysis
phase.
4. Cut it in two subspaces
How you do this depends on your way of implementing the whole
thing. You might cut a linked list into two, or just define that the old
subspace ends in index N and the new one starts at N+1. The tricky
thing is to know where to cut.
You could cut the subspace in half (by leaving as many colors on
one side as the other, or by leaving as many color intensities on one
side as the other), but we’ll cut it on the median of the color values.
You calculated the sum of all color values of the component you
sorted the subspace by.
Now you’ll need to find the median, that is, the position where the running
sum of values reaches the midpoint of the whole sum.
Example : We have the following values : 1, 5, 7, 9, 10, 11, 17, 21. The sum
of these is 81. Half of the sum is 40.5. Now, to find the median, we start
accumulating a new sum until it passes 40.5 : 1+5+7+9+10+11=43, so the cut goes
just before the 11, and the first half will have 5 values and the second, 3.
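A small C sketch of this cutting rule (the function name is mine; it just
reproduces the arithmetic above) :

#include <stdio.h>

/* Returns the number of values that go into the first half : we accumulate
   until the running sum would pass half of the total, and cut just before
   the value that crosses it. */
static int median_cut_index(const int *v, int n)
{
    long total = 0, running = 0;
    for (int i = 0; i < n; i++) total += v[i];

    for (int i = 0; i < n; i++) {
        if (2 * (running + v[i]) > total)   /* adding v[i] would cross total/2 */
            return i;
        running += v[i];
    }
    return n;
}

int main(void)
{
    int v[] = { 1, 5, 7, 9, 10, 11, 17, 21 };   /* the example values */
    int cut = median_cut_index(v, 8);
    printf("first half : %d values, second half : %d values\n", cut, 8 - cut);
    return 0;
}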
Since many pictures tend to have black in them, and since with VGA we usually
want color 0 to be black (so the borders won’t annoy us too much), I solved
this by separating all the black colors from the input values into a separate
list and forcing color 0 to be black. Otherwise there will not be a completely
black color !
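A rough sketch of that separation, using the colorListStruct table from above
(the function name is mine) :

/* Remove all pure-black entries from the color list before quantization;
   they are covered by palette entry 0, which is forced to black by hand. */
static int stripBlack(colorListStruct *list, int n)
{
    int out = 0;
    for (int i = 0; i < n; i++) {
        if (list[i].R == 0 && list[i].G == 0 && list[i].B == 0)
            continue;                    /* black pixels are handled by palette entry 0 */
        list[out++] = list[i];           /* keep the non-black colors */
    }
    return out;                          /* new number of colors in the list */
}

/* After quantizing the remaining colors into palette entries 1..255,
   force entry 0 to R=G=B=0 so a completely black color always exists. */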