Assembly (8) A more flexible method of locating memory addresses

Foreword

Earlier, we used [0] and [bx] in instructions to specify the address of the memory unit to access. In this blog post, we explain some more flexible methods of locating memory addresses and the related programming methods.

The and and or instructions

and instruction: the logical AND instruction; it performs a bitwise AND operation.

For example:

mov  al, 01100011B 
and  al, 00111011B

After execution: al = 00100011B;

With this instruction, selected bits of the operand can be set to 0 while the other bits remain unchanged.

For example:

  • Set bit 6 of al to 0: and al, 10111111B;
  • Set bit 7 of al to 0: and al, 01111111B;
  • Set bit 0 of al to 0: and al, 11111110B;

or instruction: the logical OR instruction; it performs a bitwise OR operation.

For example:

mov al, 01100011B 
or al, 00111011B

After execution: al = 01111011B;

With this instruction, selected bits of the operand can be set to 1 while the other bits remain unchanged.

For example:

  • Set bit 6 of al to 1: or al, 01000000B;
  • Set bit 7 of al to 1: or al, 10000000B;
  • Set bit 0 of al to 1: or al, 00000001B;
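
The results above can be checked with a small Python sketch. It is only a model for verifying the arithmetic, not part of the assembly program: al is represented as an integer limited to 8 bits, and the helper names and8 and or8 are hypothetical.

```python
# Model the 8-bit register al as an integer in the range 0..255.
# and8 and or8 are hypothetical helpers, not real instructions.

def and8(value, mask):
    """Bitwise AND on an 8-bit operand, as the and instruction performs."""
    return value & mask & 0xFF

def or8(value, mask):
    """Bitwise OR on an 8-bit operand, as the or instruction performs."""
    return (value | mask) & 0xFF

al = 0b01100011
and_result = and8(al, 0b00111011)
or_result = or8(al, 0b00111011)
print(format(and_result, "08b"))  # 00100011
print(format(or_result, "08b"))   # 01111011

# A mask with a single 0 clears one bit; a mask with a single 1 sets one bit.
bit6_cleared = and8(al, 0b10111111)
bit7_set = or8(al, 0b10000000)
print(format(bit6_cleared, "08b"))  # 00100011
print(format(bit7_set, "08b"))      # 11100011
```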

About ASCII codes

There are many encoding schemes in the world, and there is a scheme called ASCII encoding, which is usually used in computer systems.

Simply put, the so-called coding scheme is a set of rules, which stipulates what kind of information is used to represent real objects.

For example, in the ASCII encoding scheme, use 61H for “a” and 62H for “b”.

A rule needs to be followed to make sense.

A text editing process includes encoding and decoding according to ASCII encoding rules.

During text editing, we press the a key on the keyboard and see “a” on the screen. What happens in between?

The ASCII code of “a”, 61H, is written into video memory, and the graphics card uses that code to display the character “a” on the screen.

Data given as characters

In an assembly program, we can use '…' to indicate that data is given in the form of characters; the assembler converts them into the corresponding ASCII codes, as follows:

assume cs:code,ds:data
data segment
 db 'unIX' 
 db 'foRK'
data ends
code segment
  start:mov al,'a'
        mov bl,'b'
        mov ax,4c00h
        int 21h
code ends
end start

In the above source program:

  • db 'unIX' is equivalent to db 75H,6EH,49H,58H; the ASCII codes of "u", "n", "I", "X" are 75H, 6EH, 49H, 58H respectively;
  • db 'foRK' is equivalent to db 66H,6FH,52H,4BH; the ASCII codes of "f", "o", "R", "K" are 66H, 6FH, 52H, 4BH respectively;
  • mov al,'a' is equivalent to mov al,61H; the ASCII code of "a" is 61H;
  • mov bl,'b' is equivalent to mov bl,62H; the ASCII code of "b" is 62H;
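
As a quick cross-check of these equivalences, the following Python sketch lists the bytes that a character string encodes to in ASCII. The helper name db_bytes is hypothetical and only mimics what the assembler does for a db '…' definition.

```python
# db_bytes is a hypothetical helper: it returns the ASCII codes that a
# db '...' definition would place in memory, written in the xxH style.

def db_bytes(text):
    return [format(code, "02X") + "H" for code in text.encode("ascii")]

unix_codes = db_bytes("unIX")
fork_codes = db_bytes("foRK")
print(unix_codes)  # ['75H', '6EH', '49H', '58H']
print(fork_codes)  # ['66H', '6FH', '52H', '4BH']
print(db_bytes("a"), db_bytes("b"))  # ['61H'] ['62H']
```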

Case conversion problem

First of all, we know that the ASCII codes corresponding to uppercase characters and lowercase characters of the same letter are different. For example, the ASCII code of “A” is 41H, and the ASCII code of “a” is 61H.

To change the case of a letter is actually to change its corresponding ASCII code.

We can list the ASCII codes corresponding to the uppercase characters and lowercase characters of all letters, compare them, and find the rules.

Uppercase  Binary      Lowercase  Binary
A          01000001    a          01100001
B          01000010    b          01100010
C          01000011    c          01100011
D          01000100    d          01100100

By comparison, we can see that the ASCII code value of lowercase letters is 20H larger than the ASCII code value of uppercase letters.

In this way, we can think that if we subtract 20H from the ASCII code value of “a”, we can get “A”; if we add 20H to the ASCII code value of “A”, we can get “a”.

Since we have not yet learned the conditional (judgment) instructions, the program cannot first check whether a letter is uppercase or lowercase and then add or subtract 20H, so we need another approach. Looking at the binary form of the ASCII codes, the uppercase and lowercase forms of a letter are identical except for bit 5 (bits are numbered from 0).

Bit 5 of the ASCII code is 0 for uppercase letters and 1 for lowercase letters. Therefore, to convert a letter to uppercase we clear bit 5 with the and instruction, and to convert it to lowercase we set bit 5 with the or instruction, whatever its original case was.
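
The bit-5 rule can be sketched in Python as follows. This is an illustration under the assumption that the input is always an ASCII letter; to_upper and to_lower are hypothetical helper names, and the masks are the same ones the and and or instructions would use.

```python
# Hypothetical helpers illustrating the bit-5 rule for ASCII letters.
# They assume the input code is a letter; other characters are not handled.

def to_upper(code):
    """Clear bit 5 (mask 11011111B), like: and al,11011111B."""
    return code & 0b11011111

def to_lower(code):
    """Set bit 5 (mask 00100000B), like: or al,00100000B."""
    return code | 0b00100000

print(chr(to_upper(ord("a"))), chr(to_upper(ord("A"))))  # A A
print(chr(to_lower(ord("a"))), chr(to_lower(ord("A"))))  # a a
print(format(ord("a") - ord("A"), "02X") + "H")           # 20H
```

No test of the original case is needed: clearing an already-clear bit or setting an already-set bit leaves the letter unchanged.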

[bx+idata]

Earlier, we used [bx] to specify a memory unit. We can also specify a memory unit in a more flexible way:

[bx+idata] represents a memory unit whose offset address is (bx)+idata (the value in bx plus idata).

Let's look at what the instruction mov ax,[bx+200] means:

  • It sends the contents of a memory unit into ax. This memory unit is 2 bytes long (a word unit) and stores one word; its offset address is the value in bx plus 200, and its segment address is in ds.
  • (ax) = ((ds)*16+(bx)+200)

The instruction mov ax,[bx+200] can also be written in the following (commonly used) formats:

  • mov ax,[200+bx]
  • mov ax,200[bx]
  • mov ax,[bx].200

Question

Use Debug to view the memory, and the results are as follows:
2000:1000 BE 00 06 00 00 00 …
Write out the contents of ax, bx, and cx after the following program is executed.

mov ax,2000H
mov ds,ax
mov bx,1000H
mov ax,[bx]
mov cx,[bx+1]
add cx,[bx+2]

  • mov ax,[bx]: the segment address of the accessed word unit is in ds, (ds)=2000H; the offset address is in bx, (bx)=1000H; after the instruction is executed (ax)=00BEH;
  • mov cx,[bx+1]: the segment address of the accessed word unit is in ds, (ds)=2000H; offset address (bx)+1=1001H; after the instruction is executed (cx)=0600H;
  • add cx,[bx+2]: the segment address of the accessed word unit is in ds, (ds)=2000H; offset address (bx)+2=1002H, where the word 0006H is stored; after the instruction is executed (cx)=0600H+0006H=0606H. (bx) is still 1000H.

This [bx+idata] way of representing memory units lets us look at the data to be processed from a higher-level, more structured perspective.
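
The word reads in this exercise can be reproduced with a small Python model. It is a sketch under simple assumptions: memory is a dictionary keyed by physical address, unlisted bytes are 0, and physical and read_word are hypothetical helper names. A word is read in little-endian order, with the low byte at the lower address.

```python
# A minimal model of 8086 word reads; helper names are hypothetical.
memory = {}

def physical(segment, offset):
    """Physical address = segment*16 + offset, limited to 20 bits."""
    return (segment * 16 + offset) & 0xFFFFF

def read_word(segment, offset):
    """Read a little-endian word: low byte first, high byte second."""
    low = memory.get(physical(segment, offset), 0)
    high = memory.get(physical(segment, (offset + 1) & 0xFFFF), 0)
    return (high << 8) | low

# Debug dump from the question: 2000:1000 BE 00 06 00 00 00
for i, byte in enumerate([0xBE, 0x00, 0x06, 0x00, 0x00, 0x00]):
    memory[physical(0x2000, 0x1000 + i)] = byte

ds, bx = 0x2000, 0x1000
ax = read_word(ds, bx)                         # mov ax,[bx]
cx = read_word(ds, bx + 1)                     # mov cx,[bx+1]
cx = (cx + read_word(ds, bx + 2)) & 0xFFFF     # add cx,[bx+2]
print(format(ax, "04X"), format(bx, "04X"), format(cx, "04X"))  # 00BE 1000 0606
```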

Use [bx+idata] to process arrays

Fill in the code in codesg, convert the first string defined in datasg to uppercase, and convert the second string to lowercase.

assume cs:codesg,ds:datasg
datasg segment
db 'BaSiC'
db 'MinIX'
datasg ends

codesg segment
 start: …
codesg ends
end start

According to the original method, use [bx] to locate the characters in the string.

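       mov ax,datasg
       mov ds,ax
       mov bx,0             ;first string starts at offset 0
       mov cx,5
    s: mov al,[bx]
       and al,11011111b     ;clear bit 5: convert to uppercase
       mov [bx],al
       inc bx
       loop s
       mov bx,5             ;second string starts at offset 5
       mov cx,5
   s0: mov al,[bx]
       or al,00100000b      ;set bit 5: convert to lowercase
       mov [bx],al
       inc bx
       loop s0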

Now that we have the [bx+idata] method, we can use a more simplified method to complete the above program.

We observe two strings in the datasg section, one starts at address 0 and the other starts at address 5.

We can think of these two strings as two arrays, one starts at address 0 and the other starts at 5.

Then we can use [0+bx] and [5+bx] to locate characters in both strings in the same loop.

Here, 0 and 5 give the starting offset address of the two strings, and bx gives the relative address from the starting offset address.

The starting addresses of these two strings in memory are not the same, but, for each character in them, the relative address change from the starting address is the same.

Improved program:

       mov ax,datasg
       mov ds,ax
       mov bx,0

       mov cx,5
    s: mov al,[bx]          ;locate the character of the first string
       and al,11011111b
       mov [bx],al
       mov al,[5+bx]        ;locate the character of the second string
       or al,00100000b
       mov [5+bx],al
       inc bx
       loop s
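
To see the effect of the single loop on the data, here is a Python sketch of the same idea. The bytearray named data stands in for the datasg segment, and the loop variable bx plays the role of the register; this is an illustration of the indexing pattern, not a translation of every instruction.

```python
# data models the 10 bytes of datasg: 'BaSiC' at offset 0, 'MinIX' at offset 5.
data = bytearray(b"BaSiC" + b"MinIX")

for bx in range(5):                 # cx = 5 iterations, bx = 0..4
    data[0 + bx] &= 0b11011111      # [0+bx]: first string, force uppercase
    data[5 + bx] |= 0b00100000      # [5+bx]: second string, force lowercase

print(data.decode("ascii"))         # BASICminix
```

The constants 0 and 5 select the array, and bx selects the element within it, which is the same role idata and bx play in [bx+idata].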

SI and DI

SI and DI are registers in the 8086 CPU with functions similar to bx, but SI and DI cannot be split into two 8-bit registers.

The following three sets of instructions implement the same function:

(1)

mov bx,0
mov ax,[bx]

(2)

mov si,0
mov ax,[si]

(3)

mov di,0
mov ax,[di]

Use the registers SI and DI to copy the string 'welcome to masm!' into the data area that follows it.

assume cs:codesg,ds:datasg
datasg segment 
  db 'welcome to masm!'
  db '................'
datasg ends

Analysis:

Most of the programs we write process data, and data is stored in memory, so before processing the data we must first figure out where it is stored, that is, its memory address.

Because 'welcome to masm!' is stored starting at offset address 0 and is 16 bytes long, the data area after it starts at offset address 16; this is the space where the copy of the string will be stored.

Use ds:si to point to the source string to be copied, use ds:di to point to the destination space, and then use a loop to complete the copy.

codesg segment
start: mov ax,datasg
         mov ds,ax
         mov si,0
         mov di,16
         mov cx,8
    s:  mov ax,[si]
         mov [di],ax
         add si,2
         add di,2
         loop s

         mov ax,4c00h
         int 21h
codesg ends
end start

Note: the program uses a 16-bit register to transfer data between memory units, so 2 bytes are copied at a time and the loop runs 8 times.

Of course, we can also use [bx(si/di)+idata] to make the program more concise:

codesg segment
start: mov ax,datasg
         mov ds,ax
         mov si,0
         mov cx,8
    s:  mov ax,0[si]
         mov 16[si],ax
         add si,2
         loop s
         mov ax,4c00h
         int 21h
codesg ends
end start
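
The word-by-word copy can be sketched in Python to confirm the loop count and offsets. Here the bytearray named data is an assumed stand-in for datasg, and si is an ordinary variable mirroring the register.

```python
# data models datasg: 16 bytes of text followed by 16 bytes of '.'.
data = bytearray(b"welcome to masm!" + b"." * 16)

si = 0
for _ in range(8):                              # cx = 8 iterations
    # mov ax,0[si] / mov 16[si],ax : copy one word (2 bytes)
    data[16 + si:16 + si + 2] = data[si:si + 2]
    si += 2                                     # add si,2

print(data[16:].decode("ascii"))                # welcome to masm!
```

Eight iterations of two bytes each cover exactly the 16 bytes of the source string.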

[bx+si] and [bx+di]

[bx+si] represents a memory unit whose offset address is (bx)+(si) (that is, the value in bx plus the value in si);

The mathematical description of the instruction mov ax,[bx+si] is: (ax)=( (ds)*16+(bx)+(si) );

The instruction can also be written in the following (commonly used) format: mov ax,[bx][si];

Use Debug to view the memory, the results are as follows:

2000:1000 BE 00 06 00 00 00 ……

Write out the contents of ax, bx, and cx after the following program is executed:

mov ax,2000H
mov ds,ax
mov bx,1000H
mov si,0
mov ax,[bx+si]
inc si
mov cx,[bx+si]
inc si
mov di,si
add cx,[bx+di]

Analysis:

mov ax,[bx+si]:

The segment address of the accessed word unit is in ds, (ds)=2000H;
offset address = (bx)+(si)=1000H;
after instruction execution (ax)=00BEH;

mov cx,[bx+si]:
The segment address of the accessed word unit is in ds, (ds)=2000H;
offset address = (bx)+(si)=1001H;
after the instruction is executed (cx)=0600H;

add cx,[bx+di]:
The segment address of the accessed word unit is in ds, (ds)=2000H;
offset address = (bx)+(di)=1002H;
after the instruction is executed (cx)=0606H;

[bx+si+idata] and [bx+di+idata]

[bx+si+idata] and [bx+di+idata] have similar meanings; take [bx+si+idata] as an example:

[bx+si+idata] represents a memory unit whose offset address is (bx)+(si)+idata (that is, the value in bx plus the value in si plus idata);

Meaning of the instruction mov ax,[bx+si+idata]:

It sends the contents of a memory unit into ax. This memory unit is 2 bytes long (a word unit) and stores one word; its offset address is the value in bx plus the value in si plus idata, and its segment address is in ds.

The mathematical description is: (ax)=( (ds)*16+(bx)+(si)+idata )

The instruction can also be written in the following (commonly used) formats:

mov ax,[bx+200+si]
mov ax,[200+bx+si]
mov ax,200[bx][si]
mov ax,[bx].200[si]
mov ax,[bx][si].200

Use Debug to view the memory, the results are as follows:

2000:1000 BE 00 06 00 6A 22 ……

Write out the contents of ax, bx, and cx after the following program is executed:

mov ax,2000H
 mov ds,ax
 mov bx,1000H
 mov si,0
 mov ax,[bx+2+si]
 inc si
 mov cx,[bx+2+si]
 inc si
 mov di,si
 mov ax,[bx+2+di]

Analysis:

mov ax,[bx+2+si]:
The segment address of the accessed word unit is in ds, (ds)=2000H;
offset address = (bx)+(si)+2=1002H;
after the instruction is executed (ax)=0006H;

mov cx,[bx+2+si]:
The segment address of the accessed word unit is in ds, (ds)=2000H;
offset address = (bx)+(si)+2=1003H;
after the instruction is executed (cx)=6A00H (the low byte 00H is at 1003H and the high byte 6AH is at 1004H);

mov ax,[bx+2+di]:
The segment address of the accessed word unit is in ds, (ds)=2000H;
offset address = (bx)+(di)+2=1004H;
after the instruction is executed (ax)=226AH;
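
Because a word is stored with its low byte at the lower address, the three reads in this exercise can be double-checked with a short Python sketch. The list named dump holds the bytes shown by Debug starting at offset 1000H, and word_at is a hypothetical helper.

```python
# dump holds the bytes at 2000:1000 onward, as shown by Debug.
BASE = 0x1000
dump = [0xBE, 0x00, 0x06, 0x00, 0x6A, 0x22]

def word_at(offset):
    """Hypothetical helper: little-endian word at the given offset in ds."""
    low = dump[offset - BASE]
    high = dump[offset - BASE + 1]
    return (high << 8) | low

bx, idata = 0x1000, 2
first = word_at(bx + idata + 0)    # si = 0, offset 1002H
second = word_at(bx + idata + 1)   # si = 1, offset 1003H
third = word_at(bx + idata + 2)    # di = 2, offset 1004H
print(format(first, "04X"), format(second, "04X"), format(third, "04X"))
# 0006 6A00 226A
```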

Flexible application of different addressing modes

If we compare the methods of locating memory addresses (which can be called addressing modes) that we have used so far, we find the following:
(1) [idata] uses a constant to represent the address; it can be used to locate a memory unit directly;
(2) [bx] uses a variable to represent the memory address; it can be used to locate a memory unit indirectly;
(3) [bx+idata] uses a variable and a constant to represent the address; starting from a base address, a variable can be used to locate a memory unit indirectly;
(4) [bx+si] uses two variables to represent the address;
(5) [bx+si+idata] uses two variables and a constant to represent the address.

Summary: from [idata] up to [bx+si+idata], we gain more and more flexible ways to locate the address of a memory unit. This allows us to look at the data we are dealing with from a more structured perspective.
